Examples of (E)SDA Items:

Histograms

qplot(dat$ppersonspo, geom="histogram", binwidth=.05)

Boxplots

ggplot(dat@data, aes(x=decade, y=PHISPANIC))+ geom_boxplot()

Boxplot of %Hispanic in a census tract, by median year homes were built in San Antonio, TX

Scatter plots

Scatterplots show bivariate relationships. This can give you a visual indication of an association between two variables: a positive association (positive correlation) or a negative association (negative correlation). Scatterplots also let you see potential outliers (abnormal observations) in the data.

qplot(y = dat$ppersonspo, x=dat$PHISPANIC)

Scatterplot of %Hispanic in a census tract, by % in poverty, in San Antonio, TX

Parallel coordinate plots

Parallel coordinate plots allow for visualization of the association between multiple variables (multivariate). Each variable is plotted according to its “coordinates” or values.

ggparcoord(data=dat@data, columns = c("ppersonspo", "PHISPANIC", "PWHITE"), groupColumn = "MEDHHINC", order = "allClass", scale="uniminmax")

Parallel coordinate plot of %in poverty, %White and %Hispanic, in San Antonio, TX, shaded by the median household income of the tract.

Measuring Spatial Autocorrelation

If we observe data Z(s) (an attribute) at location i, and again at location j, the spatial autocorrelation between \(Z(s)_i\) and \(Z(s)_j\) is the degree of similarity between them, measured as the standardized covariance between their locations and values. In the absence of spatial autocorrelation, the locations of \(Z(s)_i\) and \(Z(s)_j\) have nothing to do with their values. On the other hand, if autocorrelation is present, close proximity of the locations i and j leads to close values of \(Z(s)_i\) and \(Z(s)_j\).

Positive autocorrelation

This means that a feature is positively associated with the values of the surrounding area (as defined by the spatial weight matrix): high values occur near high values, and low values near low values.

Negative autocorrelation

This means that a feature is negatively associated with the values of the surrounding area (as defined by the spatial weight matrix): high values occur near low values, and vice versa.

The most popular global autocorrelation statistic is (probably) Moran’s I:

\(I = \frac{n}{(n - 1)\sigma^2 w_{..}} \sum_{i=1}^n \sum_{j=1}^n w_{ij} (Z(s_i) - \bar Z)(Z(s_j) - \bar Z)\)

with \(Z(s_i)\) being the value of the attribute at location i, \(Z(s_j)\) the value of the attribute at location j, \(\sigma^2\) the sample variance, \(w_{ij}\) the weight for the pair of locations i and j (1 if they are neighbors, 0 otherwise), and \(w_{..}\) the sum of all the weights.
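As a sketch, the formula can be computed directly in base R. Here we use a hypothetical 4-area rook adjacency (the same small example worked through below) and an attribute z = (1, 2, 3, 4):

```r
# Moran's I computed directly from the formula, for a toy 4-area example
W <- matrix(c(0, 1, 1, 0,
              1, 0, 0, 1,
              1, 0, 0, 1,
              0, 1, 1, 0), nrow = 4, byrow = TRUE)  # binary rook weights
z  <- c(1, 2, 3, 4)
n  <- length(z)
zc <- z - mean(z)        # deviations from the mean
# (n - 1) * sigma^2 is just sum(zc^2); w.. is the sum of all the weights
I  <- (n / (sum(zc^2) * sum(W))) * sum(W * outer(zc, zc))
I                        # exactly 0 for this configuration
```

For this particular layout the positive and negative cross-products cancel, so I comes out to exactly 0: here, knowing a neighbor’s value tells you nothing about an area’s own value.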

If I is greater than E(I), there is global positive autocorrelation; if I is less than E(I), there is global negative autocorrelation, where E(I) is the expected value of I under no spatial autocorrelation:

\(E(I)=-\frac{1}{n-1}\)
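For example, with the n = 234 San Antonio tracts used in the empirical example below, the expected value is small and negative:

```r
# Expected value of Moran's I under the null of no spatial autocorrelation
n <- 234            # number of tracts in the San Antonio example below
E_I <- -1 / (n - 1)
E_I                 # about -0.00429, the Expectation reported by moran.test()
```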

Moran’s I Scatterplot

It is sometimes useful to visualize the relationship between the actual values of a variable and their spatially lagged values. This is the so-called Moran scatterplot. Lagged values are the average value of the surrounding neighborhood around location i: lag(Z) = \(\sum_j w_{ij} z_j\), or \(z'W\) in matrix terms. The Moran scatterplot shows the association between the observation of interest and its neighborhood’s average value. The variables are generally plotted as z-scores, to avoid scaling issues.

Moran Scatter plot for San Antonio Poverty Rates

Here is the overall poverty rate map for San Antonio, the Global Moran’s I was .655.

And here we show the Moran scatterplot:

#Moran Scatter plot
moran.plot(as.numeric(scale(dat$ppersonspo)), listw=salw, xlab="Poverty Rate", ylab="Lagged Poverty Rate", main=c("Moran Scatterplot for Poverty", "in San Antonio. I=.655") )

Moran’s I is basically a correlation (think Pearson’s \(\rho\)) between a variable and a spatially lagged version of itself. Spatial lags of a variable are computed by multiplying the variable through the spatial weight matrix for the data.

If we have a value \(Z(s_i)\) at location i and a spatial weight matrix \(w_{ij}\) describing the spatial neighborhood around location i, we can find the lagged value of the variable by: \(WZ_i = \sum_j w_{ij} Z(s_j)\)

This calculates what is, effectively, the neighborhood average value in the locations around location i, often written \(Z(s_{-i})\).

Again, suppose we have the adjacency matrix from above, a Rook-based adjacency weight matrix:

\[ w_{ij} = \begin{bmatrix} 0 & 1 & 1 & 0\\ 1 & 0 & 0 & 1 \\ 1 & 0 & 0& 1\\ 0& 1& 1 & 0 \end{bmatrix} \]

Typically this matrix is standardized by dividing each element of \(w_{ij}\) by the number of neighbors of area i; this is called row-standardization:

\[ w_{ij} = \begin{bmatrix} 0 & .5 & .5 & 0\\ .5 & 0 & 0 & .5 \\ .5 & 0 & 0& .5\\ 0& .5& .5 & 0 \end{bmatrix} \]

and a variable z, equal to:

\[z=\begin{bmatrix} 1 & 2 & 3 & 4 \end{bmatrix}\]

When we form the product: \(z'W\), we get:

\[z_{lag}=\begin{bmatrix} 2.5 & 2.5 & 2.5 & 2.5 \end{bmatrix}\]

Now we see where the y-value of the Moran scatterplot comes from: it is just the lagged version of the original variable.
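We can verify this product quickly in base R (W is symmetric here, so \(Wz\) and \(z'W\) give the same values):

```r
# Row-standardized rook weight matrix and variable from the example above
W <- matrix(c(0, .5, .5, 0,
              .5, 0, 0, .5,
              .5, 0, 0, .5,
              0, .5, .5, 0), nrow = 4, byrow = TRUE)
z <- c(1, 2, 3, 4)
W %*% z   # the spatial lag of z: each area's neighborhood average
```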

Bivariate Moran Plots

The so-called bivariate Moran statistic is just the correlation of a lagged variable, y, with another variable, x. An example from San Antonio is a bivariate Moran analysis of poverty and crime rates.

## [1] 0.1283409

Which is a pretty low correlation (I = .128). This shows slight positive autocorrelation, meaning that areas with higher crime rates tend to surround areas with higher poverty rates.

Local Moran’s I

So far, we have only seen a global statistic for autocorrelation, which tells us if there is any overall clustering in our data. We may be more interested in where the autocorrelation occurs, or where clusters are located. A local version of the Moran statistic is available as well. It calculates the Moran statistic from above, but only for the local neighborhood: it compares the observation’s value to the local neighborhood average, instead of the global average. Anselin (1995) referred to this as a “LISA” statistic, for Local Indicator of Spatial Association.

Here is a LISA map for clusters of poverty in San Antonio: LISA

which shows areas of low poverty clustering in blue, and high poverty clustering in red. These are so-called clusters because they are areas with higher (or lower, for the blues) than average poverty rates, surrounded by areas that likewise have higher (or lower) than average poverty rates. The red clusters are called “high-high clusters”; likewise, the blue areas are called “low-low clusters”. We also see light pink and light blue polygons. The light pink polygons represent areas that have high poverty rates but are in a low-poverty spatial neighborhood, and are called high-low outliers. The one in the northwest part of San Antonio is a good example, because it’s the UTSA main campus, a student ghetto, if you will.

Likewise, the light blue polygons are called low-high outliers, because they have low poverty rates but are in a high-poverty spatial neighborhood.

What these methods tell you

  • Moran’s I is a descriptive statistic ONLY
  • It simply indicates if there is spatial association/autocorrelation in a variable
  • Local Moran’s I tells you if there is significant localized clustering of the variable

1 Empirical Example

GeoDa is very adept at doing this stuff, but alas, it’s point and click. Coding means you have something to come back to when you want to do the same job again, meaning your research is reproducible. What if you click the wrong button next time?

First we load some libraries we need and the data, which is stored in an ESRI shapefile. All data can be found on Github.

library(spdep)
library(maptools)
library(RColorBrewer)
library(classInt)
dat<-readShapePoly("SA_classdata.shp", proj4string=CRS("+proj=utm +zone=14 +north"))

#create a mortality rate, 3 year average (mean of annual deaths)
dat$mort3<-apply(dat@data[, c("deaths09", "deaths10", "deaths11")], 1, mean)
dat$mortrate<-1000*dat$mort3/dat$acs_poptot

#get rid of tracts with very low populations
dat<-dat[which(dat$acs_poptot>100),]

#Show a basic quantile map of the mortality rate
spplot(dat, "mortrate",
       main="Mortality Rate, 3 Year Avg, 2009-2011",
       at=quantile(dat$mortrate), 
       col.regions=brewer.pal(n=5, "Blues"))

#Make a rook style weight matrix
sanb<-poly2nb(dat, queen=F)
summary(sanb)
## Neighbour list object:
## Number of regions: 234 
## Number of nonzero links: 1094 
## Percentage nonzero weights: 1.997955 
## Average number of links: 4.675214 
## Link number distribution:
## 
##  1  2  3  4  5  6  7  8  9 
##  4 11 29 63 68 31 23  3  2 
## 4 least connected regions:
## 61 82 147 205 with 1 link
## 2 most connected regions:
## 31 55 with 9 links
salw<-nb2listw(sanb, style="W")
#the style = W row-standardizes the matrix

#make a k-nearest neighbor weight set, with k=3
knn<-knearneigh(x = coordinates(dat), k = 3)
knn3<-knn2nb(knn = knn)

#Visualize the neighbors
plot(dat, main="Rook Neighbors")
plot(salw, coords=coordinates(dat), add=T, col=2)

plot(dat, main="k=3 Neighbors")
plot(knn3, coords=coordinates(dat), add=T, col=2)

#generate locally weighted values
wmat<-nb2mat(sanb, style="W")
dat$mort.w<-wmat%*%(dat$mortrate)

#plot the raw mortality rate and the spatially averaged rate.
spplot(dat, c("mortrate", "mort.w"),
       at=quantile(dat$mortrate), 
       col.regions=brewer.pal(n=5, "Blues"),
       main="Mortality Rate, 3 Year Avg, 2009-2011")

#Calculate univariate moran's I for the mortality rate
moran.test(dat$mortrate, listw=salw)
## 
##  Moran I test under randomisation
## 
## data:  dat$mortrate  
## weights: salw  
## 
## Moran I statistic standard deviate = 10.815, p-value < 2.2e-16
## alternative hypothesis: greater
## sample estimates:
## Moran I statistic       Expectation          Variance 
##       0.472820119      -0.004291845       0.001946053
moran.mc(dat$mortrate, listw=salw, nsim=999)
## 
##  Monte-Carlo simulation of Moran I
## 
## data:  dat$mortrate 
## weights: salw  
## number of simulations + 1: 1000 
## 
## statistic = 0.47282, observed rank = 1000, p-value = 0.001
## alternative hypothesis: greater
#Moran Scatter plot
moran.plot(dat$mortrate, listw=salw)

#Local Moran Analysis
locali<-localmoran(dat$mortrate, salw, p.adjust.method="fdr")
dat$locali<-locali[,1]
dat$localp<-locali[,5]

#We have to make our own cluster identifiers
dat$cl<-as.factor(ifelse(dat$localp<=.05,"Clustered","NotClustered"))

#Plots of the results
spplot(dat, "locali", main="Local Moran's I", at=quantile(dat$locali), col.regions=brewer.pal(n=4, "RdBu"))

spplot(dat, "cl", main="Local Moran Clusters", col.regions=c(2,0))

#Local Getis-Ord G analysis
localg<-localG(dat$mortrate, salw)
dat$localg<-as.numeric(localg)

#Plots
spplot(dat, "localg", main="Local Getis-Ord G", at=c(-4, -3,-2,-1,0,1,2,3, 4), col.regions=brewer.pal(n=8, "RdBu"))

2 Introduction to Spatial Regression Models

Up until now, we have been concerned with describing the structure of spatial data through correlation and the methods of exploratory spatial data analysis.

Through ESDA, we examined data for patterns and using the Moran I and Local Moran I statistics, we examined clustering of variables. Now we consider regression models for continuous outcomes. We begin with a review of the Ordinary Least Squares model for a continuous outcome.

2.1 OLS Model

The basic OLS model is an attempt to estimate the effect of an independent variable(s) on the value of a dependent variable. This is written as: \(y_i = \beta_0 + \beta_1 * x_i + e_i\)

where y is the dependent variable that we want to model, x is the independent variable we think has an association with y, \(\beta_0\) is the model intercept, or grand mean of y, when x = 0, and \(\beta_1\) is the slope parameter that defines the strength of the linear relationship between x and y. e is the error in the model for y that is unaccounted for by the values of x and the grand mean \(\beta_0\). The average, or expected value of y is : \(E[y|x] = \beta_0 + \beta_1 * x_i\), which is the linear mean function for y, conditional on x, and this gives us the customary linear regression plot:

set.seed(1234)
x<- rnorm(100, 10, 5)
beta0<-1
beta1<-1.5
y<-beta0+beta1*x+rnorm(100, 0, 5)

plot(x, y)
abline(coef = coef(lm(y~x)), lwd=1.5)

summary(lm(y~x))$coef
##             Estimate Std. Error   t value     Pr(>|t|)
## (Intercept) 1.446620  1.0879494  1.329676 1.867119e-01
## x           1.473915  0.1037759 14.202863 1.585002e-25

Where, the line shows \(E[y|x] = \beta_0 + \beta_1 * x_i\)

We assume that the errors, \(e_i \sim N(0, \sigma^2)\), are independent, Normally distributed and homoskedastic, with variance \(\sigma^2\).

This is the simple model with one predictor. We can easily add more predictors to the equation and rewrite it: \(y_i = \beta_0 + \sum_k \beta_k x_{ik} + e_i\)

So, now the mean of y is modeled with multiple x variables. We can write this relationship more compactly using matrix notation:

\(Y = X ' \beta + e\)

Where Y is now a \(n*1\) vector of observations of our dependent variable, X is a \(n*k\) matrix of independent variables, with the first column being all 1’s and e is the \(n*1\) vector of errors for each observation.

In matrices this looks like: \[y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}\]

\[\beta = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{bmatrix}\]

\[x=\begin{bmatrix} 1 & x_{1,1} & x_{1,2} & \dots & x_{1, k}\\ 1 & x_{2,1} & x_{2,2} & \dots & x_{2, k} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_{n,1} & x_{n,2} & \dots & x_{n, k} \end{bmatrix}\]

\[ e = \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{bmatrix}\]

The residuals are uncorrelated, with covariance matrix:

\[ \Sigma = \sigma^2 I = \sigma^2 * \begin{bmatrix} 1 & 0 & 0 & \dots & 0\\ 0 & 1 & 0 & \dots & 0 \\ 0 & \vdots & \vdots & \dots & \vdots \\ 0 & 0 & 0 & \dots & 1 \end{bmatrix} = \begin{bmatrix} \sigma^2 & 0 & 0 & \dots & 0\\ 0 & \sigma^2 & 0 & \dots & 0 \\ 0 & \vdots & \vdots & \dots & \vdots \\ 0 & 0 & 0 & \dots & \sigma^2 \end{bmatrix}\]

To estimate the \(\beta\) coefficients, we use the customary OLS estimator \(\beta = (X'X)^{-1} (X' Y)\)

this is the estimator that minimizes the residual sum of squares: \((Y - X ' \beta)' (Y - X' \beta)\)

or

\((Y - \hat Y)' (Y - \hat Y)\)
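As a quick check, we can compute this estimator by hand for the simulated data from above (re-running the same simulation) and compare it to lm():

```r
# Hand-compute beta = (X'X)^{-1} (X'Y) and compare to lm()
set.seed(1234)
x <- rnorm(100, 10, 5)
y <- 1 + 1.5 * x + rnorm(100, 0, 5)   # same simulation as above

X <- cbind(1, x)                       # design matrix: intercept column + x
beta_hat <- solve(t(X) %*% X, t(X) %*% y)
cbind(beta_hat, coef(lm(y ~ x)))       # the two columns agree
```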

We can inspect the properties of the estimates by examining the residuals, or \(e_i\) of the model. Since we assume the data are normal, a quantile-quantile (Q-Q) plot of the residuals against the expected quantile of the standard normal distribution should be a straight line. Formal tests of normality can also be used.

fit<-lm(y~x)
qqnorm(rstudent(fit))
qqline(rstudent(fit))

shapiro.test(resid(fit))
## 
##  Shapiro-Wilk normality test
## 
## data:  resid(fit)
## W = 0.98878, p-value = 0.5677
library(nortest)  #ad.test() and lillie.test() are in the nortest package
ad.test(resid(fit))
## 
##  Anderson-Darling normality test
## 
## data:  resid(fit)
## A = 0.39859, p-value = 0.3593
lillie.test(resid(fit))
## 
##  Lilliefors (Kolmogorov-Smirnov) normality test
## 
## data:  resid(fit)
## D = 0.052017, p-value = 0.7268

We may also inspect the association between \(e_i\), or more appropriately the studentized/standardized residuals, and the predictors and the dependent variable. If we see evidence of association, then homoskedasticity is a poor assumption.

par(mfrow=c(2,2))
plot(fit)

par(mfrow=c(1,1))

2.1.1 Model-data agreement

Do we (meaning our data) meet the statistical assumptions of our analytical models?
Always ask this of any analysis you do, if your model is wrong, your inference will also be wrong.

Since spatial data often display correlations amongst closely located observations (autocorrelation), we should probably test for autocorrelation in the model residuals, as that would violate the assumptions of the OLS model.

One method for doing this is to calculate the value of Moran’s I for the OLS residuals.

library(spdep)
library(maptools)
library(RColorBrewer)
dat<-readShapePoly("SA_classdata.shp", proj4string=CRS("+proj=utm +zone=14 +north"))
#Make a rook style weight matrix
sanb<-poly2nb(dat, queen=F)
summary(sanb)
## Neighbour list object:
## Number of regions: 235 
## Number of nonzero links: 1106 
## Percentage nonzero weights: 2.002716 
## Average number of links: 4.706383 
## Link number distribution:
## 
##  1  2  3  4  5  6  7  8  9 
##  4 10 30 62 66 34 24  3  2 
## 4 least connected regions:
## 61 82 147 205 with 1 link
## 2 most connected regions:
## 31 55 with 9 links
salw<-nb2listw(sanb, style="W")

fit2<-lm(I(viol3yr/acs_poptot) ~ pfemhh + hwy + p5yrinmig + log(MEDHHINC), data=dat )

dat$resid<-rstudent(fit2)
spplot(dat, "resid",at=quantile(dat$resid), col.regions=brewer.pal(n=5, "RdBu"), main="Residuals from OLS Fit of Crime Rate")

lm.morantest(fit2, listw = salw, resfun = rstudent)
## 
##  Global Moran I for regression residuals
## 
## data:  
## model: lm(formula = I(viol3yr/acs_poptot) ~ pfemhh + hwy +
## p5yrinmig + log(MEDHHINC), data = dat)
## weights: salw
## 
## Moran I statistic standard deviate = 0.75475, p-value = 0.2252
## alternative hypothesis: greater
## sample estimates:
## Observed Moran I      Expectation         Variance 
##      0.021326432     -0.011406176      0.001880845

In this case, there appears to be no clustering in the residuals, since the observed value of Moran’s I is .021, with a z-test of 0.75, p = .225.

2.1.2 Extending the OLS model to accommodate spatial structure

If we now assume we measure our Y and X’s at specific spatial locations (s), then we have Y(s) and X(s).

In most analyses, the spatial location (i.e. the county or census tract) only serves to link X and Y so we can collect our data on them; in the subsequent analysis this spatial information is ignored, and nothing explicitly considers the spatial relationships between the variables or the locations.

In fact, even though we measure Y(s) and X(s), what we end up analyzing is X and Y, and we apply the ordinary regression methods to these data to understand the effects of X on Y.

Moreover, we could move them around in space (as long as we keep the observations together \(y_i\) with \(x_i\)) and still get the same results. Such analyses have been called a-spatial. This is the kind of regression model you are used to fitting, where we ignore any information on the locations of the observations themselves.

However, we can extend the simple regression case to include the information on (s) and incorporate it into our models explicitly, so they are no longer a-spatial.

There are several methods by which to incorporate the (s) locations into our models; the main alternatives are:

  • The structured linear mixed (multi-level) model, or GLMM (generalized linear mixed model)
  • Spatial filtering of observations
  • Spatially autoregressive models
  • Geographically weighted regression

We will first deal with the case of the spatially autoregressive model, or SAR model, as its structure is just a modification of the OLS model from above.

2.2 Spatially autoregressive models

We saw in the normal OLS model that some of the basic assumptions are that: 1) the model residuals are distributed as iid Normal random variates, and 2) they have common (and constant, meaning homoskedastic) variance.

Spatial data, however, present a series of problems for the standard OLS regression model. These problems are typically seen as various representations of spatial structure or dependence within the data. The spatial structure of the data can introduce spatial dependence into the outcome, the predictors, and the model residuals.

This can be observed as neighboring observations both having high (or low) values (positive autocorrelation) for the dependent variable, the model predictors, or the model residuals. We can also observe situations where areas with high values are surrounded by areas with low values (negative autocorrelation).

Since the standard OLS model assumes the residuals (and the outcomes themselves) are uncorrelated, as previously stated, the autocorrelation inherent to most spatial data violates the iid distributional assumptions for the residuals, and could violate the assumption of common variance for the OLS residuals. To account for the expected spatial association in the data, we would like a model that accounts for the spatial structure of the data. One way of doing this is by allowing for correlation between residuals in our model, or for correlation in the dependent variable.

We are familiar with the concept of autoregression amongst neighboring observations. This concept is that a particular observation is a linear combination of its neighboring values. This autoregression introduces dependence into the data. Instead of specifying the autoregression structure directly, we introduce spatial autocorrelation through a global autocorrelation coefficient and a spatial proximity measure.

There are 2 basic forms of the spatial autoregressive model: the spatial lag and the spatial error models.

Both of these models build on the basic OLS regression model: \(Y = X'\beta + e\)

Where Y is the dependent variable, X is the matrix of independent variables, \(\beta\) is the vector of regression parameters to be estimated from the data, and e are the model residuals, which are assumed to be distributed as a Gaussian random variable with mean 0 and constant variance-covariance matrix \(\Sigma\) .

2.2.1 The spatial lag model

The spatial lag model introduces autocorrelation into the regression model by lagging the dependent variable itself, much like in a time-series approach.
The model is specified as: \(Y= \rho W Y + X '\beta +e\)

where \(\rho\) is the autoregressive coefficient, which tells us how strong the resemblance is, on average, between \(Y_i\) and its neighbors. The matrix W is the spatial weight matrix, describing the spatial network structure of the observations, as described in the ESDA lecture.

In the lag model, we are specifying the spatial component on the dependent variable. This leads to a spatial filtering of the variable, where values are averaged over the surrounding neighborhood defined in W; the result is called the spatially lagged variable.

The specification that is used most often is a spatially filtered Y variable that can then be regressed on X, which can directly be seen in a re-expression of the OLS model as:

\(Y= \rho WY + X' \beta + e\)

where the direct effect of the spatial lagging of the dependent variable is seen.
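Solving for Y (a standard re-arrangement, assuming \((I - \rho W)\) is invertible) gives the reduced form:

\[Y = (I - \rho W)^{-1}(X'\beta + e)\]

which shows that, through the spatial multiplier \((I - \rho W)^{-1}\), each observation’s outcome depends on the covariates and errors of its neighbors, its neighbors’ neighbors, and so on.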

To estimate these models we can use either GeoDa or R; in R we use the spdep package and the lagsarlm() function.

The lag model is:

fit.lag<-lagsarlm(I(viol3yr/acs_poptot) ~ pfemhh + hwy + p5yrinmig + log(MEDHHINC), data=dat, listw = salw) 
summary(fit.lag, Nagelkerke=T)
## 
## Call:
## lagsarlm(formula = I(viol3yr/acs_poptot) ~ pfemhh + hwy + p5yrinmig + 
##     log(MEDHHINC), data = dat, listw = salw)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0635446 -0.0147641 -0.0036721  0.0090372  0.3252902 
## 
## Type: lag 
## Coefficients: (asymptotic standard errors) 
##                 Estimate Std. Error z value  Pr(>|z|)
## (Intercept)    0.3136449  0.0924307  3.3933 0.0006906
## pfemhh         0.1913535  0.0336049  5.6942 1.239e-08
## hwy            0.0075802  0.0056013  1.3533 0.1759604
## p5yrinmig      0.0794330  0.0202592  3.9208 8.824e-05
## log(MEDHHINC) -0.0337148  0.0082612 -4.0811 4.482e-05
## 
## Rho: 0.034756, LR test value: 0.18517, p-value: 0.66697
## Asymptotic standard error: 0.082235
##     z-value: 0.42264, p-value: 0.67256
## Wald statistic: 0.17862, p-value: 0.67256
## 
## Log likelihood: 441.5604 for lag model
## ML residual variance (sigma squared): 0.0013657, (sigma: 0.036955)
## Nagelkerke pseudo-R-squared: 0.36486 
## Number of observations: 235 
## Number of parameters estimated: 7 
## AIC: -869.12, (AIC for lm: -870.94)
## LM test for residual autocorrelation
## test value: 0.4691, p-value: 0.4934

We see that \(\rho\) is estimated to be .034, and the likelihood ratio test shows that this is not significantly different from 0.

2.2.2 The spatial error model

The spatial error model says that the autocorrelation is not in the outcome itself; instead, any autocorrelation is attributable to missing spatial covariates in the data. If these spatially patterned covariates could be measured, then the autocorrelation would be 0. This model is written:

\(Y= X' \beta +e\)

\(e=\lambda W e + v\)
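Substituting for e (again assuming \((I - \lambda W)\) is invertible), the reduced form of the error model is:

\[Y = X'\beta + (I - \lambda W)^{-1}v\]

so the spatial dependence enters only through the error term, not through the mean of Y.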

This model, in effect, controls for the nuisance of correlated errors in the data that are attributable to an inherently spatial process, or to spatial autocorrelation in the measurement errors of the measured and possibly unmeasured variables in the model.

This model is estimated in R using the errorsarlm() function in the spdep library.

fit.err<-errorsarlm(I(viol3yr/acs_poptot) ~ pfemhh + hwy + p5yrinmig + log(MEDHHINC), data=dat, listw = salw) 
summary(fit.err, Nagelkerke=T)
## 
## Call:
## errorsarlm(formula = I(viol3yr/acs_poptot) ~ pfemhh + hwy + p5yrinmig + 
##     log(MEDHHINC), data = dat, listw = salw)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0650577 -0.0141501 -0.0034659  0.0092839  0.3241926 
## 
## Type: error 
## Coefficients: (asymptotic standard errors) 
##                 Estimate Std. Error z value  Pr(>|z|)
## (Intercept)    0.3138785  0.0922430  3.4027 0.0006671
## pfemhh         0.1971225  0.0340765  5.7847 7.264e-09
## hwy            0.0073140  0.0057043  1.2822 0.1997763
## p5yrinmig      0.0781620  0.0205345  3.8064 0.0001410
## log(MEDHHINC) -0.0337316  0.0082489 -4.0892 4.328e-05
## 
## Lambda: 0.070725, LR test value: 0.53188, p-value: 0.46582
## Asymptotic standard error: 0.094212
##     z-value: 0.7507, p-value: 0.45283
## Wald statistic: 0.56355, p-value: 0.45283
## 
## Log likelihood: 441.7338 for error model
## ML residual variance (sigma squared): 0.0013625, (sigma: 0.036912)
## Nagelkerke pseudo-R-squared: 0.3658 
## Number of observations: 235 
## Number of parameters estimated: 7 
## AIC: -869.47, (AIC for lm: -870.94)

We see \(\lambda\) = .071, with a p-value of .465, suggesting again that, in this case, there is no autocorrelation in the model residuals.

We can examine the relative fits of each model by extracting the AIC values from each:

AIC(fit.lag)
## [1] -869.1208
AIC(fit.err)
## [1] -869.4675
AIC(fit2)
## [1] -870.9356

The spatial models’ AIC values are slightly higher than the OLS model’s (remember, lower AIC is better), so there is little evidence favoring the spatial regression models in this case.

2.3 Examination of Model Specification

To some degree, both of the SAR specifications allow us to model spatial dependence in the data. The primary difference between them is where we model said dependence.

The lag model says that the dependence affects the dependent variable only, we can liken this to a diffusion scenario, where your neighbors have a diffusive effect on you.

The error model says that dependence affects the residuals only. We can liken this to the missing spatially dependent covariate situation, where, if only we could measure another really important spatially associated predictor, we could account for the spatial dependence. But alas, we cannot, and we instead model dependence in our errors.

These are inherently two completely different ways to think about specifying a model, and we should really make our decision based upon how we think our process of interest operates.

That being said, this way of thinking isn’t necessarily popular among practitioners. Most practitioners want the best fitting model, ‘nuff said. So methods have been developed that test for alternate model specifications, to see which kind of model best summarizes the observed variation in the dependent variable and the spatial dependence.

These are a set of so-called Lagrange Multiplier tests (econometricians’ jargon for score tests). These tests compare the model fits from the OLS, spatial error, and spatial lag models using the method of the score test.

For those who don’t remember, the score test is a test based on the relative change in the first derivative of the likelihood function around the maximum likelihood estimate. The particular thing here that affects the value of this derivative is the autoregressive parameter, \(\rho\) or \(\lambda\). In the OLS model, \(\rho = \lambda = 0\) (so both the lag and error models simplify to OLS), but as this parameter changes, so does the likelihood for the model, which is why the derivative of the likelihood function is used. This is all related to how the estimation routines estimate the value of \(\rho\) or \(\lambda\).

2.3.1 Using the Lagrange Multiplier Test (LMT)

In general, you fit the OLS model to your dependent variable, then submit the OLS model fit to the LMT testing procedure.

Then you look to see which model (spatial error, or spatial lag) has the highest value for the test.

Enter the uncertainty…

So how much bigger, you might say?

Well, drastically bigger. If the LMT for the error model is 2500 and the LMT for the lag model is 2480, this is NOT A BIG DIFFERENCE, only about 1%. If you see a LMT for the error model of 2500 and a LMT for the lag model of 250, THIS IS A BIG DIFFERENCE.

So what if you don’t see a BIG DIFFERENCE, HOW DO YOU DECIDE WHICH MODEL TO USE???

Well, you could think more, but who has time for that.

The econometricians have thought up a “better” LMT test, the so-called robust LMT (robust to what, I’m not sure), but it is said that it can settle such problems of a “not so big difference” between the lag and error model specifications.

So what do you do? In general, think about your problem before you run your analysis; should this fail you, proceed with using the LMT; if this is inconclusive, look at the robust LMT, and choose the model which has the larger value for this test.
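A sketch of the procedure, using spdep’s lm.LMtests() on a small simulated lattice (not the San Antonio data; the data and variable names here are made up for illustration): fit the OLS model, then submit the fit along with the spatial weights.

```r
library(spdep)

# Toy data on a 5x5 grid, purely for illustration
set.seed(1)
nb <- cell2nb(5, 5)                # rook neighbours on a 5x5 lattice
lw <- nb2listw(nb, style = "W")
df <- data.frame(x = rnorm(25))
df$y <- 1 + 2 * df$x + rnorm(25)

fit0 <- lm(y ~ x, data = df)
# LMerr/LMlag are the standard tests; RLMerr/RLMlag are the robust variants
lm.LMtests(fit0, listw = lw, test = c("LMerr", "LMlag", "RLMerr", "RLMlag"))
```

You would then compare the test statistics across the four results, as described above.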

3 More Data Examples:

3.0.1 San Antonio, TX mortality data

#Spatial Regression Models 1

#, proj4string=CRS("+proj=utm zone=14")
dat<-readShapePoly("SA_classdata.shp")
dat<-dat[which(dat$acs_poptot>100),]

#Create a good representative set of neighbor types
sa.nb6<-knearneigh(coordinates(dat), k=6)
sa.nb6<-knn2nb(sa.nb6)
sa.wt6<-nb2listw(sa.nb6, style="W")

sa.nb5<-knearneigh(coordinates(dat), k=5)
sa.nb5<-knn2nb(sa.nb5)
sa.wt5<-nb2listw(sa.nb5, style="W")

sa.nb4<-knearneigh(coordinates(dat), k=4)
sa.nb4<-knn2nb(sa.nb4)
sa.wt4<-nb2listw(sa.nb4, style="W")

sa.nb3<-knearneigh(coordinates(dat), k=3)
sa.nb3<-knn2nb(sa.nb3)
sa.wt3<-nb2listw(sa.nb3,style="W")

sa.nb2<-knearneigh(coordinates(dat), k=2)
sa.nb2<-knn2nb(sa.nb2)
sa.wt2<-nb2listw(sa.nb2,style="W")

sa.nbr<-poly2nb(dat, queen=F)
sa.wtr<-nb2listw(sa.nbr, zero.policy=T)

sa.nbq<-poly2nb(dat, queen=T)
sa.wtq<-nb2listw(sa.nbq, style="W", zero.policy=T)

sa.nbd<-dnearneigh(coordinates(dat), d1=0, d2=10000)
sa.wtd<-nb2listw(sa.nbd, zero.policy=T)

#create a mortality rate, 3 year average
dat$mort3<-apply(dat@data[, c("deaths09", "deaths10", "deaths11")],1,mean)
dat$mortrate<-1000*dat$mort3/dat$acs_poptot

#just a quick look at the distribution of the outcome
hist(dat$mortrate)

#do some basic regression models, without spatial structure

fit.1<-lm(scale(mortrate)~scale(ppersonspo)+scale(I(viol3yr/acs_poptot))+scale(dissim)+scale(ppop65plus), data=dat)
summary(fit.1)
## 
## Call:
## lm(formula = scale(mortrate) ~ scale(ppersonspo) + scale(I(viol3yr/acs_poptot)) + 
##     scale(dissim) + scale(ppop65plus), data = dat)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.96044 -0.27804 -0.00673  0.26359  2.18006 
## 
## Coefficients:
##                               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                  1.483e-16  3.120e-02   0.000   1.0000    
## scale(ppersonspo)            1.215e-01  5.047e-02   2.407   0.0169 *  
## scale(I(viol3yr/acs_poptot)) 2.287e-01  3.841e-02   5.953  9.8e-09 ***
## scale(dissim)                8.467e-02  4.817e-02   1.758   0.0801 .  
## scale(ppop65plus)            7.240e-01  3.594e-02  20.146  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4773 on 229 degrees of freedom
## Multiple R-squared:  0.7761, Adjusted R-squared:  0.7722 
## F-statistic: 198.4 on 4 and 229 DF,  p-value: < 2.2e-16
vif(fit.1)
##            scale(ppersonspo) scale(I(viol3yr/acs_poptot)) 
##                     2.605367                     1.508595 
##                scale(dissim)            scale(ppop65plus) 
##                     2.372665                     1.320878
par(mfrow=c(2,2))
plot(fit.1)

par(mfrow=c(1,1))
#this is a test for constant variance
bptest(fit.1)  #looks like we have heteroskedasticity
## 
##  studentized Breusch-Pagan test
## 
## data:  fit.1
## BP = 51.088, df = 4, p-value = 2.14e-10
#extract studentized residuals from the fit, and examine them
dat$residfit1<-rstudent(fit.1)

cols<-brewer.pal(5,"RdBu")
spplot(dat,"residfit1", at=quantile(dat$residfit1), col.regions=cols, main="Residuals from Model fit of Mortality Rate")

Chi and Zhu suggest using a wide array of neighbor specifications, then picking the one that maximizes the autocorrelation coefficient. Here I emulate their approach:

#test for residual autocorrelation 
resi<-c(lm.morantest(fit.1, listw=sa.wt2)$estimate[1],
lm.morantest(fit.1, listw=sa.wt3)$estimate[1],
lm.morantest(fit.1, listw=sa.wt4)$estimate[1],
lm.morantest(fit.1, listw=sa.wt5)$estimate[1],
lm.morantest(fit.1, listw=sa.wt6)$estimate[1],
lm.morantest(fit.1, listw=sa.wtd, zero.policy=T)$estimate[1],
lm.morantest(fit.1, listw=sa.wtq)$estimate[1],
lm.morantest(fit.1, listw=sa.wtr)$estimate[1])
plot(resi, type="l")
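Chi and Zhu's criterion can then be applied directly to the vector of residual Moran's I values computed above. A minimal sketch (the labels are mine, matching the order the tests were stacked into resi):

```r
# Label each candidate specification in the order it was added to resi,
# then pick the one with the largest observed Moran's I
names(resi) <- c("k2", "k3", "k4", "k5", "k6", "dist10km", "queen", "rook")
resi[which.max(resi)]
```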

lm.morantest(fit.1, listw=sa.wt2)
## 
##  Global Moran I for regression residuals
## 
## data:  
## model: lm(formula = scale(mortrate) ~ scale(ppersonspo) +
## scale(I(viol3yr/acs_poptot)) + scale(dissim) + scale(ppop65plus),
## data = dat)
## weights: sa.wt2
## 
## Moran I statistic standard deviate = -1.3282, p-value = 0.908
## alternative hypothesis: greater
## sample estimates:
## Observed Moran I      Expectation         Variance 
##     -0.089262137     -0.010642737      0.003503515
lm.morantest(fit.1, listw=sa.wt3)
## 
##  Global Moran I for regression residuals
## 
## data:  
## model: lm(formula = scale(mortrate) ~ scale(ppersonspo) +
## scale(I(viol3yr/acs_poptot)) + scale(dissim) + scale(ppop65plus),
## data = dat)
## weights: sa.wt3
## 
## Moran I statistic standard deviate = 0.10133, p-value = 0.4596
## alternative hypothesis: greater
## sample estimates:
## Observed Moran I      Expectation         Variance 
##     -0.005775111     -0.010724844      0.002386190
lm.morantest(fit.1, listw=sa.wt4)
## 
##  Global Moran I for regression residuals
## 
## data:  
## model: lm(formula = scale(mortrate) ~ scale(ppersonspo) +
## scale(I(viol3yr/acs_poptot)) + scale(dissim) + scale(ppop65plus),
## data = dat)
## weights: sa.wt4
## 
## Moran I statistic standard deviate = 0.90538, p-value = 0.1826
## alternative hypothesis: greater
## sample estimates:
## Observed Moran I      Expectation         Variance 
##       0.02812192      -0.01050301       0.00182003
lm.morantest(fit.1, listw=sa.wt5)
## 
##  Global Moran I for regression residuals
## 
## data:  
## model: lm(formula = scale(mortrate) ~ scale(ppersonspo) +
## scale(I(viol3yr/acs_poptot)) + scale(dissim) + scale(ppop65plus),
## data = dat)
## weights: sa.wt5
## 
## Moran I statistic standard deviate = 1.3996, p-value = 0.08082
## alternative hypothesis: greater
## sample estimates:
## Observed Moran I      Expectation         Variance 
##       0.04315932      -0.01029146       0.00145856
lm.morantest(fit.1, listw=sa.wt6)
## 
##  Global Moran I for regression residuals
## 
## data:  
## model: lm(formula = scale(mortrate) ~ scale(ppersonspo) +
## scale(I(viol3yr/acs_poptot)) + scale(dissim) + scale(ppop65plus),
## data = dat)
## weights: sa.wt6
## 
## Moran I statistic standard deviate = 1.7095, p-value = 0.04368
## alternative hypothesis: greater
## sample estimates:
## Observed Moran I      Expectation         Variance 
##      0.049203016     -0.010078442      0.001202519
lm.morantest(fit.1, listw=sa.wtd, zero.policy=T)
## 
##  Global Moran I for regression residuals
## 
## data:  
## model: lm(formula = scale(mortrate) ~ scale(ppersonspo) +
## scale(I(viol3yr/acs_poptot)) + scale(dissim) + scale(ppop65plus),
## data = dat)
## weights: sa.wtd
## 
## Moran I statistic standard deviate = 2.6709, p-value = 0.003782
## alternative hypothesis: greater
## sample estimates:
## Observed Moran I      Expectation         Variance 
##     1.901642e-02    -6.929932e-03     9.436939e-05
lm.morantest(fit.1, listw=sa.wtq)
## 
##  Global Moran I for regression residuals
## 
## data:  
## model: lm(formula = scale(mortrate) ~ scale(ppersonspo) +
## scale(I(viol3yr/acs_poptot)) + scale(dissim) + scale(ppop65plus),
## data = dat)
## weights: sa.wtq
## 
## Moran I statistic standard deviate = 0.65308, p-value = 0.2569
## alternative hypothesis: greater
## sample estimates:
## Observed Moran I      Expectation         Variance 
##      0.017905792     -0.010601820      0.001905428
lm.morantest(fit.1, listw=sa.wtr)
## 
##  Global Moran I for regression residuals
## 
## data:  
## model: lm(formula = scale(mortrate) ~ scale(ppersonspo) +
## scale(I(viol3yr/acs_poptot)) + scale(dissim) + scale(ppop65plus),
## data = dat)
## weights: sa.wtr
## 
## Moran I statistic standard deviate = 0.65308, p-value = 0.2569
## alternative hypothesis: greater
## sample estimates:
## Observed Moran I      Expectation         Variance 
##      0.017905792     -0.010601820      0.001905428
#looks like we have minimal autocorrelation in our residuals, but the
#distance-based weights do show significant autocorrelation

#Let's look at the local autocorrelation in the mortality rate
#get the values of the local Moran's I
dat$lmfit1<-localmoran(dat$mortrate, sa.wt5, zero.policy=T)[,1]
brks<-classIntervals(dat$lmfit1, n=5, style="quantile")
spplot(dat, "lmfit1", at=brks$brks, col.regions=brewer.pal(5, "RdBu"), main="Local Moran Plot of Mortality Rate")

#Now we fit the spatial lag model 
#The lag model is fit with lagsarlm() in the spdep library
#we basically specify the same model as in the lm() fit above
#But we need to specify the spatial weight matrix and the type
#of lag model to fit

fit.lag<-lagsarlm(scale(mortrate)~scale(ppersonspo)+scale(I(viol3yr/acs_poptot))+scale(dissim)+scale(ppop65plus), data=dat, listw=sa.wt2, type="lag")
summary(fit.1)
## 
## Call:
## lm(formula = scale(mortrate) ~ scale(ppersonspo) + scale(I(viol3yr/acs_poptot)) + 
##     scale(dissim) + scale(ppop65plus), data = dat)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.96044 -0.27804 -0.00673  0.26359  2.18006 
## 
## Coefficients:
##                               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                  1.483e-16  3.120e-02   0.000   1.0000    
## scale(ppersonspo)            1.215e-01  5.047e-02   2.407   0.0169 *  
## scale(I(viol3yr/acs_poptot)) 2.287e-01  3.841e-02   5.953  9.8e-09 ***
## scale(dissim)                8.467e-02  4.817e-02   1.758   0.0801 .  
## scale(ppop65plus)            7.240e-01  3.594e-02  20.146  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4773 on 229 degrees of freedom
## Multiple R-squared:  0.7761, Adjusted R-squared:  0.7722 
## F-statistic: 198.4 on 4 and 229 DF,  p-value: < 2.2e-16
summary(fit.lag, Nagelkerke=T)
## 
## Call:
## lagsarlm(formula = scale(mortrate) ~ scale(ppersonspo) + scale(I(viol3yr/acs_poptot)) + 
##     scale(dissim) + scale(ppop65plus), data = dat, listw = sa.wt2, 
##     type = "lag")
## 
## Residuals:
##         Min          1Q      Median          3Q         Max 
## -1.91953751 -0.28320690  0.00039592  0.25188870  2.28912609 
## 
## Type: lag 
## Coefficients: (asymptotic standard errors) 
##                                Estimate Std. Error z value  Pr(>|z|)
## (Intercept)                  -0.0035113  0.0307209 -0.1143    0.9090
## scale(ppersonspo)             0.1047111  0.0509073  2.0569    0.0397
## scale(I(viol3yr/acs_poptot))  0.2235749  0.0379480  5.8916 3.825e-09
## scale(dissim)                 0.0758260  0.0476155  1.5925    0.1113
## scale(ppop65plus)             0.7019724  0.0376957 18.6221 < 2.2e-16
## 
## Rho: 0.066138, LR test value: 2.1119, p-value: 0.14616
## Asymptotic standard error: 0.043789
##     z-value: 1.5104, p-value: 0.13095
## Wald statistic: 2.2813, p-value: 0.13095
## 
## Log likelihood: -155.394 for lag model
## ML residual variance (sigma squared): 0.22064, (sigma: 0.46973)
## Nagelkerke pseudo-R-squared: 0.77808 
## Number of observations: 234 
## Number of parameters estimated: 7 
## AIC: 324.79, (AIC for lm: 324.9)
## LM test for residual autocorrelation
## test value: 9.0458, p-value: 0.002633
bptest.sarlm(fit.lag)
## 
##  studentized Breusch-Pagan test
## 
## data:  
## BP = 53.263, df = 4, p-value = 7.506e-11
#robust SE's for the spatial model: regress the spatially filtered outcome
#(fit.lag$tary) on the model matrix (fit.lag$tarX) and use a sandwich estimator
library(sandwich)
lm.target <- lm(fit.lag$tary ~ fit.lag$tarX - 1)
coeftest(lm.target, vcov.=vcovHC(lm.target, type="HC0"))
## 
## t test of coefficients:
## 
##                                             Estimate Std. Error t value
## fit.lag$tarXx(Intercept)                  -0.0035113  0.0307070 -0.1143
## fit.lag$tarXxscale(ppersonspo)             0.1047111  0.0790642  1.3244
## fit.lag$tarXxscale(I(viol3yr/acs_poptot))  0.2235749  0.1140534  1.9603
## fit.lag$tarXxscale(dissim)                 0.0758260  0.0498978  1.5196
## fit.lag$tarXxscale(ppop65plus)             0.7019724  0.0617274 11.3721
##                                           Pr(>|t|)    
## fit.lag$tarXx(Intercept)                   0.90906    
## fit.lag$tarXxscale(ppersonspo)             0.18670    
## fit.lag$tarXxscale(I(viol3yr/acs_poptot))  0.05118 .  
## fit.lag$tarXxscale(dissim)                 0.12998    
## fit.lag$tarXxscale(ppop65plus)             < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#Next we fit the spatial error model
fit.err<-errorsarlm(scale(mortrate)~scale(ppersonspo)+scale(I(viol3yr/acs_poptot))+scale(dissim)+scale(ppop65plus), data=dat, listw=sa.wt2)
summary(fit.err, Nagelkerke=T)
## 
## Call:
## errorsarlm(formula = scale(mortrate) ~ scale(ppersonspo) + scale(I(viol3yr/acs_poptot)) + 
##     scale(dissim) + scale(ppop65plus), data = dat, listw = sa.wt2)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -1.9126220 -0.2695111 -0.0030349  0.2680765  2.1121587 
## 
## Type: error 
## Coefficients: (asymptotic standard errors) 
##                               Estimate Std. Error z value  Pr(>|z|)
## (Intercept)                  0.0029538  0.0277141  0.1066   0.91512
## scale(ppersonspo)            0.1202146  0.0484444  2.4815   0.01308
## scale(I(viol3yr/acs_poptot)) 0.2303560  0.0371271  6.2045 5.486e-10
## scale(dissim)                0.0870377  0.0458697  1.8975   0.05776
## scale(ppop65plus)            0.7350826  0.0340110 21.6131 < 2.2e-16
## 
## Lambda: -0.10644, LR test value: 2.2391, p-value: 0.13456
## Asymptotic standard error: 0.071547
##     z-value: -1.4877, p-value: 0.13684
## Wald statistic: 2.2132, p-value: 0.13684
## 
## Log likelihood: -155.3304 for error model
## ML residual variance (sigma squared): 0.22001, (sigma: 0.46905)
## Nagelkerke pseudo-R-squared: 0.7782 
## Number of observations: 234 
## Number of parameters estimated: 7 
## AIC: 324.66, (AIC for lm: 324.9)
#As a pretty good indicator of which model is best, look at the AIC of each
AIC(fit.1)
## [1] 324.8999
AIC(fit.lag)
## [1] 324.788
AIC(fit.err)
## [1] 324.6608

3.0.2 Larger data example: US counties

This example shows much stronger spatial effects.

spdat<-readShapePoly("~/Google Drive/a&m_stuff//usdata_mort.shp")
#Create a good representative set of neighbor types
us.nb6<-knearneigh(coordinates(spdat), k=6)
us.nb6<-knn2nb(us.nb6)
us.wt6<-nb2listw(us.nb6, style="W")

us.nb5<-knearneigh(coordinates(spdat), k=5)
us.nb5<-knn2nb(us.nb5)
us.wt5<-nb2listw(us.nb5, style="W")

us.nb4<-knearneigh(coordinates(spdat), k=4)
us.nb4<-knn2nb(us.nb4)
us.wt4<-nb2listw(us.nb4, style="W")

us.nb3<-knearneigh(coordinates(spdat), k=3)
us.nb3<-knn2nb(us.nb3)
us.wt3<-nb2listw(us.nb3,style="W")

us.nb2<-knearneigh(coordinates(spdat), k=2)
us.nb2<-knn2nb(us.nb2)
us.wt2<-nb2listw(us.nb2,style="W")

us.nbr<-poly2nb(spdat, queen=F)
us.wtr<-nb2listw(us.nbr, zero.policy=T)

us.nbq<-poly2nb(spdat, queen=T)
us.wtq<-nb2listw(us.nbq, style="W", zero.policy=T)

hist(spdat$mortrate)

spplot(spdat,"mortrate", at=quantile(spdat$mortrate), col.regions=brewer.pal(n=5, "Reds"), main="Spatial Distribution of US Mortality Rate")

#do some basic regression models, without spatial structure
fit.1.us<-lm(scale(mortrate)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+factor(RUCC), spdat)
summary(fit.1.us)
## 
## Call:
## lm(formula = scale(mortrate) ~ scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + factor(RUCC), data = spdat)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.5810 -0.4280  0.0216  0.4534  4.2606 
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       -0.16408    0.06218  -2.639  0.00836 ** 
## scale(ppersonspo)  0.60632    0.01789  33.893  < 2e-16 ***
## scale(p65plus)    -0.04085    0.01582  -2.582  0.00987 ** 
## scale(pblack_1)    0.10730    0.01641   6.538 7.29e-11 ***
## scale(phisp)      -0.28734    0.01484 -19.358  < 2e-16 ***
## factor(RUCC)1      0.41534    0.08731   4.757 2.05e-06 ***
## factor(RUCC)2      0.29579    0.07226   4.094 4.36e-05 ***
## factor(RUCC)3      0.11800    0.07985   1.478  0.13955    
## factor(RUCC)4      0.23900    0.08845   2.702  0.00693 ** 
## factor(RUCC)5      0.13588    0.09502   1.430  0.15282    
## factor(RUCC)6      0.41615    0.06901   6.030 1.83e-09 ***
## factor(RUCC)7      0.17107    0.07097   2.411  0.01599 *  
## factor(RUCC)8      0.11620    0.07880   1.475  0.14040    
## factor(RUCC)9     -0.20337    0.07654  -2.657  0.00793 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.7412 on 3053 degrees of freedom
## Multiple R-squared:  0.453,  Adjusted R-squared:  0.4507 
## F-statistic: 194.5 on 13 and 3053 DF,  p-value: < 2.2e-16
vif(fit.1.us)
##                       GVIF Df GVIF^(1/(2*Df))
## scale(ppersonspo) 1.786082  1        1.336444
## scale(p65plus)    1.397321  1        1.182083
## scale(pblack_1)   1.503301  1        1.226092
## scale(phisp)      1.229648  1        1.108895
## factor(RUCC)      1.724759  9        1.030746
par(mfrow=c(2,2))
plot(fit.1.us)

par(mfrow=c(1,1))
#this is a test for constant variance
bptest(fit.1.us)  #looks like we have heteroskedasticity
## 
##  studentized Breusch-Pagan test
## 
## data:  fit.1.us
## BP = 249.87, df = 13, p-value < 2.2e-16
#extract studentized residuals from the fit, and examine them
spdat$residfit1<-rstudent(fit.1.us)

cols<-brewer.pal(5,"RdBu")
spplot(spdat,"residfit1", at=quantile(spdat$residfit1), col.regions=cols, main="Residuals from Model fit of US Mortality Rate")

#test for residual autocorrelation 
resi<-c(lm.morantest(fit.1.us, listw=us.wt2)$estimate[1],
        lm.morantest(fit.1.us, listw=us.wt3)$estimate[1],
        lm.morantest(fit.1.us, listw=us.wt4)$estimate[1],
        lm.morantest(fit.1.us, listw=us.wt5)$estimate[1],
        lm.morantest(fit.1.us, listw=us.wt6)$estimate[1],
        lm.morantest(fit.1.us, listw=us.wtq,zero.policy=T)$estimate[1],
        lm.morantest(fit.1.us, listw=us.wtr,zero.policy=T)$estimate[1])
plot(resi, type="l")

lm.morantest(fit.1.us, listw=us.wt2)
## 
##  Global Moran I for regression residuals
## 
## data:  
## model: lm(formula = scale(mortrate) ~ scale(ppersonspo) +
## scale(p65plus) + scale(pblack_1) + scale(phisp) + factor(RUCC),
## data = spdat)
## weights: us.wt2
## 
## Moran I statistic standard deviate = 23.756, p-value < 2.2e-16
## alternative hypothesis: greater
## sample estimates:
## Observed Moran I      Expectation         Variance 
##     0.3965422192    -0.0018662504     0.0002812589
lm.morantest(fit.1.us, listw=us.wt3)
## 
##  Global Moran I for regression residuals
## 
## data:  
## model: lm(formula = scale(mortrate) ~ scale(ppersonspo) +
## scale(p65plus) + scale(pblack_1) + scale(phisp) + factor(RUCC),
## data = spdat)
## weights: us.wt3
## 
## Moran I statistic standard deviate = 27.862, p-value < 2.2e-16
## alternative hypothesis: greater
## sample estimates:
## Observed Moran I      Expectation         Variance 
##     0.3879364703    -0.0018143292     0.0001956793
lm.morantest(fit.1.us, listw=us.wt4)
## 
##  Global Moran I for regression residuals
## 
## data:  
## model: lm(formula = scale(mortrate) ~ scale(ppersonspo) +
## scale(p65plus) + scale(pblack_1) + scale(phisp) + factor(RUCC),
## data = spdat)
## weights: us.wt4
## 
## Moran I statistic standard deviate = 31.107, p-value < 2.2e-16
## alternative hypothesis: greater
## sample estimates:
## Observed Moran I      Expectation         Variance 
##     0.3798405687    -0.0017734270     0.0001504971
lm.morantest(fit.1.us, listw=us.wt5)
## 
##  Global Moran I for regression residuals
## 
## data:  
## model: lm(formula = scale(mortrate) ~ scale(ppersonspo) +
## scale(p65plus) + scale(pblack_1) + scale(phisp) + factor(RUCC),
## data = spdat)
## weights: us.wt5
## 
## Moran I statistic standard deviate = 33.076, p-value < 2.2e-16
## alternative hypothesis: greater
## sample estimates:
## Observed Moran I      Expectation         Variance 
##     0.3625191247    -0.0017199894     0.0001212718
lm.morantest(fit.1.us, listw=us.wt6)
## 
##  Global Moran I for regression residuals
## 
## data:  
## model: lm(formula = scale(mortrate) ~ scale(ppersonspo) +
## scale(p65plus) + scale(pblack_1) + scale(phisp) + factor(RUCC),
## data = spdat)
## weights: us.wt6
## 
## Moran I statistic standard deviate = 35.916, p-value < 2.2e-16
## alternative hypothesis: greater
## sample estimates:
## Observed Moran I      Expectation         Variance 
##     0.3600763582    -0.0016700396     0.0001014426
lm.morantest(fit.1.us, listw=us.wtq, zero.policy=T)
## 
##  Global Moran I for regression residuals
## 
## data:  
## model: lm(formula = scale(mortrate) ~ scale(ppersonspo) +
## scale(p65plus) + scale(pblack_1) + scale(phisp) + factor(RUCC),
## data = spdat)
## weights: us.wtq
## 
## Moran I statistic standard deviate = 32.728, p-value < 2.2e-16
## alternative hypothesis: greater
## sample estimates:
## Observed Moran I      Expectation         Variance 
##     0.3693002497    -0.0016900532     0.0001284917
lm.morantest(fit.1.us, listw=us.wtr, zero.policy=T)
## 
##  Global Moran I for regression residuals
## 
## data:  
## model: lm(formula = scale(mortrate) ~ scale(ppersonspo) +
## scale(p65plus) + scale(pblack_1) + scale(phisp) + factor(RUCC),
## data = spdat)
## weights: us.wtr
## 
## Moran I statistic standard deviate = 32.728, p-value < 2.2e-16
## alternative hypothesis: greater
## sample estimates:
## Observed Moran I      Expectation         Variance 
##     0.3693002497    -0.0016900532     0.0001284917
#Now we fit the spatial lag model 
#The lag model is fit with lagsarlm() in the spdep library
#we basically specify the same model as in the lm() fit above
#But we need to specify the spatial weight matrix and the type
#of lag model to fit

fit.lag.us<-lagsarlm(scale(mortrate)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+factor(RUCC), spdat, listw=us.wt2, type="lag")
summary(fit.lag.us, Nagelkerke=T)
## 
## Call:
## lagsarlm(formula = scale(mortrate) ~ scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + factor(RUCC), data = spdat, 
##     listw = us.wt2, type = "lag")
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -3.435066 -0.365396  0.016386  0.379149  4.181996 
## 
## Type: lag 
## Coefficients: (asymptotic standard errors) 
##                     Estimate Std. Error  z value  Pr(>|z|)
## (Intercept)       -0.0601089  0.0539296  -1.1146 0.2650300
## scale(ppersonspo)  0.4383725  0.0166361  26.3507 < 2.2e-16
## scale(p65plus)    -0.0067939  0.0137105  -0.4955 0.6202305
## scale(pblack_1)    0.0555812  0.0144317   3.8513 0.0001175
## scale(phisp)      -0.1891732  0.0132557 -14.2711 < 2.2e-16
## factor(RUCC)1      0.2866229  0.0756754   3.7875 0.0001522
## factor(RUCC)2      0.1402669  0.0627174   2.2365 0.0253196
## factor(RUCC)3      0.0146097  0.0692250   0.2110 0.8328515
## factor(RUCC)4      0.1496221  0.0766327   1.9525 0.0508838
## factor(RUCC)5      0.0759998  0.0823307   0.9231 0.3559531
## factor(RUCC)6      0.2510359  0.0599005   4.1909 2.779e-05
## factor(RUCC)7      0.0760537  0.0615150   1.2363 0.2163308
## factor(RUCC)8      0.0063996  0.0682837   0.0937 0.9253308
## factor(RUCC)9     -0.2506134  0.0662793  -3.7812 0.0001561
## 
## Rho: 0.39892, LR test value: 679.88, p-value: < 2.22e-16
## Asymptotic standard error: 0.014734
##     z-value: 27.075, p-value: < 2.22e-16
## Wald statistic: 733.08, p-value: < 2.22e-16
## 
## Log likelihood: -3086.33 for lag model
## ML residual variance (sigma squared): 0.4118, (sigma: 0.64172)
## Nagelkerke pseudo-R-squared: 0.56174 
## Number of observations: 3067 
## Number of parameters estimated: 16 
## AIC: 6204.7, (AIC for lm: 6882.5)
## LM test for residual autocorrelation
## test value: 33.636, p-value: 6.6459e-09
bptest.sarlm(fit.lag.us)
## 
##  studentized Breusch-Pagan test
## 
## data:  
## BP = 238.13, df = 13, p-value < 2.2e-16
#Robust SE's
lm.target.us <- lm(fit.lag.us$tary ~ fit.lag.us$tarX - 1)
coeftest(lm.target.us, vcov.=vcovHC(lm.target.us, type="HC0"))
## 
## t test of coefficients:
## 
##                                     Estimate Std. Error  t value  Pr(>|t|)
## fit.lag.us$tarXx(Intercept)       -0.0601089  0.0477044  -1.2600 0.2077555
## fit.lag.us$tarXxscale(ppersonspo)  0.4383725  0.0250784  17.4801 < 2.2e-16
## fit.lag.us$tarXxscale(p65plus)    -0.0067939  0.0176425  -0.3851 0.7002021
## fit.lag.us$tarXxscale(pblack_1)    0.0555812  0.0186951   2.9730 0.0029718
## fit.lag.us$tarXxscale(phisp)      -0.1891732  0.0168362 -11.2361 < 2.2e-16
## fit.lag.us$tarXxfactor(RUCC)1      0.2866229  0.0597309   4.7986 1.675e-06
## fit.lag.us$tarXxfactor(RUCC)2      0.1402669  0.0501894   2.7948 0.0052264
## fit.lag.us$tarXxfactor(RUCC)3      0.0146097  0.0593415   0.2462 0.8055470
## fit.lag.us$tarXxfactor(RUCC)4      0.1496221  0.0651111   2.2980 0.0216319
## fit.lag.us$tarXxfactor(RUCC)5      0.0759998  0.0803115   0.9463 0.3440640
## fit.lag.us$tarXxfactor(RUCC)6      0.2510359  0.0526992   4.7636 1.991e-06
## fit.lag.us$tarXxfactor(RUCC)7      0.0760537  0.0576485   1.3193 0.1871791
## fit.lag.us$tarXxfactor(RUCC)8      0.0063996  0.0699315   0.0915 0.9270913
## fit.lag.us$tarXxfactor(RUCC)9     -0.2506134  0.0664019  -3.7742 0.0001636
##                                      
## fit.lag.us$tarXx(Intercept)          
## fit.lag.us$tarXxscale(ppersonspo) ***
## fit.lag.us$tarXxscale(p65plus)       
## fit.lag.us$tarXxscale(pblack_1)   ** 
## fit.lag.us$tarXxscale(phisp)      ***
## fit.lag.us$tarXxfactor(RUCC)1     ***
## fit.lag.us$tarXxfactor(RUCC)2     ** 
## fit.lag.us$tarXxfactor(RUCC)3        
## fit.lag.us$tarXxfactor(RUCC)4     *  
## fit.lag.us$tarXxfactor(RUCC)5        
## fit.lag.us$tarXxfactor(RUCC)6     ***
## fit.lag.us$tarXxfactor(RUCC)7        
## fit.lag.us$tarXxfactor(RUCC)8        
## fit.lag.us$tarXxfactor(RUCC)9     ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#Next we fit the spatial error model
fit.err.us<-errorsarlm(scale(mortrate)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+factor(RUCC), spdat, listw=us.wt2)
summary(fit.err.us, Nagelkerke=T)
## 
## Call:
## errorsarlm(formula = scale(mortrate) ~ scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + factor(RUCC), data = spdat, 
##     listw = us.wt2)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -3.246243 -0.363446  0.018519  0.378347  4.269437 
## 
## Type: error 
## Coefficients: (asymptotic standard errors) 
##                     Estimate Std. Error  z value  Pr(>|z|)
## (Intercept)       -0.1519258  0.0718058  -2.1158 0.0343630
## scale(ppersonspo)  0.4956221  0.0192362  25.7651 < 2.2e-16
## scale(p65plus)    -0.0050401  0.0163231  -0.3088 0.7574975
## scale(pblack_1)    0.1454472  0.0200302   7.2614 3.830e-13
## scale(phisp)      -0.2201678  0.0193200 -11.3959 < 2.2e-16
## factor(RUCC)1      0.2983426  0.0802941   3.7156 0.0002027
## factor(RUCC)2      0.1978474  0.0802418   2.4656 0.0136768
## factor(RUCC)3      0.1136176  0.0837459   1.3567 0.1748783
## factor(RUCC)4      0.2271767  0.0877477   2.5890 0.0096261
## factor(RUCC)5      0.2166439  0.0959520   2.2578 0.0239559
## factor(RUCC)6      0.3282910  0.0763724   4.2986 1.719e-05
## factor(RUCC)7      0.1870157  0.0796173   2.3489 0.0188272
## factor(RUCC)8      0.0798636  0.0828415   0.9641 0.3350190
## factor(RUCC)9     -0.1286435  0.0843255  -1.5256 0.1271197
## 
## Lambda: 0.43168, LR test value: 582.46, p-value: < 2.22e-16
## Asymptotic standard error: 0.016177
##     z-value: 26.685, p-value: < 2.22e-16
## Wald statistic: 712.11, p-value: < 2.22e-16
## 
## Log likelihood: -3135.038 for error model
## ML residual variance (sigma squared): 0.42027, (sigma: 0.64829)
## Nagelkerke pseudo-R-squared: 0.5476 
## Number of observations: 3067 
## Number of parameters estimated: 16 
## AIC: 6302.1, (AIC for lm: 6882.5)
#As a pretty good indicator of which model is best, look at the AIC of each
AIC(fit.1.us) #the lag model summary above reports the lm AIC as 6882.5
AIC(fit.lag.us)
## [1] 6204.659
AIC(fit.err.us)
## [1] 6302.075

4 Spatial Regression Models

This lecture builds on the previous lecture covering the Spatially Autoregressive Model (SAR) with either a lag or error specification. The lag model is written: \(Y= \rho W Y + X '\beta +e\)

Where Y is the dependent variable, X is the matrix of independent variables, \(\beta\) is the vector of regression parameters to be estimated from the data, and \(\rho\) is the autoregressive coefficient, which tells us how strong the resemblance is, on average, between \(Y_i\) and its neighbors. The matrix W is the spatial weight matrix, describing the spatial network structure of the observations, as we described in the ESDA lecture.

In the lag model, we specify the spatial component on the dependent variable. This amounts to a spatial filtering of the variable, in which the values of Y are averaged over the surrounding neighborhood defined in W; the result is called the spatially lagged variable. In R we use the spdep package, and the lagsarlm() function, to fit this model.
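To make the spatially lagged variable concrete, here is a minimal, self-contained sketch on a toy 3x3 grid: cell2nb() builds rook-contiguity neighbors, and lag.listw() computes \(Wy\), each cell's row-standardized neighborhood average.

```r
library(spdep)

# Toy example: rook-contiguity neighbors on a 3x3 grid of cells
nb <- cell2nb(3, 3)
lw <- nb2listw(nb, style = "W")

y <- 1:9
# W y: each cell's value is replaced by the average of its neighbors' values
wy <- lag.listw(lw, y)
cbind(y, wy)
```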

The error model says that the autocorrelation is not in the outcome itself; instead, any autocorrelation is attributable to missing spatial covariates in the data. If these spatially patterned covariates could be measured, then the autocorrelation would be 0. This model is written:

\(Y= X' \beta +e\)

\(e=\lambda W e + v\)

This model, in effect, controls for the nuisance of correlated errors in the data that are attributable to an inherently spatial process, or to spatial autocorrelation in the measurement errors of the measured and possibly unmeasured variables in the model. This model is estimated in R using errorsarlm() in the spdep library.

4.1 Examination of Model Specification

To some degree, both of the SAR specifications allow us to model spatial dependence in the data. The primary difference between them is where we model said dependence.

The lag model says that the dependence affects the dependent variable only; we can liken this to a diffusion scenario, where your neighbors have a diffusive effect on you.

The error model says that dependence affects the residuals only. We can liken this to the missing spatially dependent covariate situation, where, if only we could measure another really important spatially associated predictor, we could account for the spatial dependence. But alas, we cannot, and we instead model dependence in our errors.

These are inherently two completely different ways to think about specifying a model, and we should really make our decision based upon how we think our process of interest operates.

That being said, this way of thinking isn't necessarily popular among practitioners. Most practitioners want the best-fitting model, 'nuff said. So methods have been developed that test alternative model specifications, to see which kind of model best summarizes both the observed variation in the dependent variable and the spatial dependence.
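The standard tools for this are Anselin's Lagrange multiplier tests, implemented in spdep as lm.LMtests(). A sketch, applied to the US county OLS fit and the k=4 weights from the previous section:

```r
# LM tests for lag vs. error dependence, plus their robust forms;
# the specification with the larger (robust) statistic is usually preferred
lm.LMtests(fit.1.us, listw = us.wt4,
           test = c("LMlag", "LMerr", "RLMlag", "RLMerr"))
```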

4.2 More exotic types of spatial dependence

Spatial Durbin Model Another form of a spatial lag model is the Spatial Durbin Model (SDM). This model is an extension of the ordinary lag or error model that includes spatially lagged independent variables. If you remember, one issue that commonly occurs with the lag model is that we often have residual autocorrelation. This autocorrelation could be attributable to a missing spatial covariate, and we can obtain a kind of spatial covariate by lagging the predictor variables using W. This model can be written:

\(Y= \rho W Y + X '\beta + W X \theta + e\)

where the \(\theta\) parameter vector contains the regression coefficients for the lagged predictor variables. We can also include the lagged predictors in an error model, which gives us the Durbin Error Model (DEM):

\(Y= X '\beta + W X \theta + e\)

\(e=\lambda W e + v\)

Generally, the spatial Durbin model is preferred to the ordinary error model, because we can include the “unspecified” spatial covariates from the error model into the Durbin model via the lagged predictor variables.
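In the spdep versions contemporary with this code, the Durbin variants can be requested from the same fitting functions; a sketch using the US county data and the k=4 weights (the covariate list is abbreviated for illustration, and newer releases use a Durbin= argument instead):

```r
# Spatial Durbin model: lag model plus spatially lagged X's
# (type = "mixed" requests the Durbin terms in older spdep)
fit.sdm <- lagsarlm(scale(mortrate) ~ scale(ppersonspo) + scale(pblack_1) +
                      scale(phisp),
                    data = spdat, listw = us.wt4, type = "mixed")

# Durbin error model: error model plus spatially lagged X's
fit.dem <- errorsarlm(scale(mortrate) ~ scale(ppersonspo) + scale(pblack_1) +
                        scale(phisp),
                      data = spdat, listw = us.wt4, etype = "emixed")
```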

Spatially Autoregressive Moving Average Model Further extensions of these models include dependence in both the outcome and the error process. Two such models are described in LeSage and Pace: the Spatial Autocorrelation (SAC) model and the spatially autoregressive moving average (SARMA) model. The SAC model is:

\(Y= \rho W_1 Y + X '\beta + e\)

\(e=\theta W_2 e + v\)

\(Y= (I_n - \rho W_1)^{-1} X '\beta + (I_n - \rho W_1)^{-1} (I_n - \theta W_2)^{-1} v\)

where you can potentially have two different spatial weight matrices, \(W_1\) and \(W_2\). Here, the lagged error term is taken over all orders of neighbors, leading to a more global error process, while the SARMA model has the form:

\(Y= \rho W_1 Y + X '\beta + u\)

\(u=(I_n - \theta W_2) e\)

\(e \sim N(0, \sigma^2 I_n)\)

\(Y= (I_n - \rho W_1)^{-1} X '\beta + (I_n - \rho W_1)^{-1} (I_n - \theta W_2) e\)

which gives a "locally" weighted moving average of the residuals, averaging them only within the local neighborhood instead of over all neighbor orders.

Fitting these models in R can be done in the spdep library.
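For instance, the SAC specification is available via sacsarlm(); a sketch with the US county data, reusing us.wt4 for both weight matrices (supplying two different matrices via listw and listw2 is the more general case):

```r
# SAC model: spatial lag in Y plus autoregressive errors; listw2 sets the
# weights for the error process (here the same matrix, for illustration)
fit.sac <- sacsarlm(scale(mortrate) ~ scale(ppersonspo) + scale(pblack_1) +
                      scale(phisp),
                    data = spdat, listw = us.wt4, listw2 = us.wt4)
summary(fit.sac, Nagelkerke = T)
```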

#Create a k=4 nearest neighbor set
us.nb4<-knearneigh(coordinates(spdat), k=4)
us.nb4<-knn2nb(us.nb4)
us.wt4<-nb2listw(us.nb4, style="W")



#make a copy of the mortality rate with a random sample of 500 counties set to missing
samp<-sample(1:dim(spdat@data)[1], size=500, replace=F)
spdat$newmort<-spdat$mortrate
spdat$newmort[samp]<-NA

nbs<-poly2nb(spdat, queen = T)
nbs
## Neighbour list object:
## Number of regions: 3067 
## Number of nonzero links: 17478 
## Percentage nonzero weights: 0.1858079 
## Average number of links: 5.698728 
## 36 regions with no links:
## 175 192 287 291 308 325 427 938 1043 1050 1061 1080 1084 1125 1131 1149 1176 1179 1183 1254 1664 1715 1724 2141 2180 2259 2290 2329 2405 2647 2650 2659 2709 2775 2788 2811
us.lw<-nb2listw(nbs, zero.policy = T)

moran.test(spdat$mortrate,listw = us.lw , zero.policy = T)
## 
##  Moran I test under randomisation
## 
## data:  spdat$mortrate  
## weights: us.lw  
## 
## Moran I statistic standard deviate = 52.275, p-value < 2.2e-16
## alternative hypothesis: greater
## sample estimates:
## Moran I statistic       Expectation          Variance 
##      0.5823858939     -0.0003300330      0.0001242565
hist(spdat$mortrate)

spplot(spdat,"mortrate", at=quantile(spdat$mortrate),col=NA, col.regions=brewer.pal(n=5, "Reds"), main="Spatial Distribution of US Mortality Rate")

fit.1.us<-lm(scale(mortrate)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7), spdat)
summary(fit.1.us)
## 
## Call:
## lm(formula = scale(mortrate) ~ scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7), data = spdat)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.8305 -0.4275  0.0283  0.4764  4.3289 
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        0.11679    0.01987   5.877 4.63e-09 ***
## scale(ppersonspo)  0.60711    0.01705  35.615  < 2e-16 ***
## scale(p65plus)    -0.04993    0.01521  -3.283  0.00104 ** 
## scale(pblack_1)    0.11096    0.01637   6.780 1.44e-11 ***
## scale(phisp)      -0.28913    0.01496 -19.327  < 2e-16 ***
## I(RUCC >= 7)TRUE  -0.25367    0.03133  -8.096 8.09e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.757 on 3061 degrees of freedom
## Multiple R-squared:  0.4279, Adjusted R-squared:  0.427 
## F-statistic: 457.9 on 5 and 3061 DF,  p-value: < 2.2e-16
lm.morantest(fit.1.us, listw=us.wt4)
## 
##  Global Moran I for regression residuals
## 
## data:  
## model: lm(formula = scale(mortrate) ~ scale(ppersonspo) +
## scale(p65plus) + scale(pblack_1) + scale(phisp) + I(RUCC >= 7),
## data = spdat)
## weights: us.wt4
## 
## Moran I statistic standard deviate = 32.692, p-value < 2.2e-16
## alternative hypothesis: greater
## sample estimates:
## Observed Moran I      Expectation         Variance 
##      0.399924558     -0.001324880      0.000150646
#SAR - Lag model
fit.lag<-lagsarlm(scale(mortrate)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7), spdat, listw=us.wt4, type="lag", method="MC")
summary(fit.lag, Nagelkerke=T)
## 
## Call:
## lagsarlm(formula = scale(mortrate) ~ scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7), data = spdat, 
##     listw = us.wt4, type = "lag", method = "MC")
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -3.466238 -0.344042  0.018551  0.372959  4.207904 
## 
## Type: lag 
## Coefficients: (numerical Hessian approximate standard errors) 
##                    Estimate Std. Error  z value  Pr(>|z|)
## (Intercept)        0.089966   0.016541   5.4389 5.362e-08
## scale(ppersonspo)  0.388833   0.015555  24.9965 < 2.2e-16
## scale(p65plus)    -0.011438   0.012382  -0.9238  0.355609
## scale(pblack_1)    0.039492   0.013775   2.8670  0.004144
## scale(phisp)      -0.159120   0.013027 -12.2150 < 2.2e-16
## I(RUCC >= 7)TRUE  -0.193853   0.026057  -7.4396 1.010e-13
## 
## Rho: 0.52125, LR test value: 901.77, p-value: < 2.22e-16
## Approximate (numerical Hessian) standard error: 0.015156
##     z-value: 34.393, p-value: < 2.22e-16
## Wald statistic: 1182.9, p-value: < 2.22e-16
## 
## Log likelihood: -3044.072 for lag model
## ML residual variance (sigma squared): 0.3983, (sigma: 0.63111)
## Nagelkerke pseudo-R-squared: 0.57365 
## Number of observations: 3067 
## Number of parameters estimated: 8 
## AIC: 6104.1, (AIC for lm: 7003.9)
#SAR - Error model
fit.err<-errorsarlm(scale(mortrate)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7), spdat, listw=us.wt4, etype="error", method="MC")
summary(fit.err, Nagelkerke=T)
## 
## Call:
## errorsarlm(formula = scale(mortrate) ~ scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7), data = spdat, 
##     listw = us.wt4, etype = "error", method = "MC")
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -3.209704 -0.346452  0.012809  0.374794  4.328396 
## 
## Type: error 
## Coefficients: (asymptotic standard errors) 
##                     Estimate Std. Error z value  Pr(>|z|)
## (Intercept)        0.0726183  0.0309916  2.3432   0.01912
## scale(ppersonspo)  0.4711012  0.0191331 24.6223 < 2.2e-16
## scale(p65plus)     0.0069189  0.0158285  0.4371   0.66203
## scale(pblack_1)    0.1241378  0.0219065  5.6667 1.456e-08
## scale(phisp)      -0.1749111  0.0216635 -8.0740 6.661e-16
## I(RUCC >= 7)TRUE  -0.1569379  0.0298030 -5.2658 1.395e-07
## 
## Lambda: 0.59315, LR test value: 862.43, p-value: < 2.22e-16
## Approximate (numerical Hessian) standard error: 0.016931
##     z-value: 35.034, p-value: < 2.22e-16
## Wald statistic: 1227.4, p-value: < 2.22e-16
## 
## Log likelihood: -3063.745 for error model
## ML residual variance (sigma squared): 0.39401, (sigma: 0.6277)
## Nagelkerke pseudo-R-squared: 0.56815 
## Number of observations: 3067 
## Number of parameters estimated: 8 
## AIC: 6143.5, (AIC for lm: 7003.9)
#Spatial Durbin Model
fit.durb<-lagsarlm(scale(mortrate)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7), spdat, listw=us.wt4, type="mixed", method="MC")
summary(fit.durb, Nagelkerke=T)
## 
## Call:
## lagsarlm(formula = scale(mortrate) ~ scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7), data = spdat, 
##     listw = us.wt4, type = "mixed", method = "MC")
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -3.447178 -0.342792  0.012991  0.367467  4.211775 
## 
## Type: mixed 
## Coefficients: (numerical Hessian approximate standard errors) 
##                        Estimate Std. Error z value  Pr(>|z|)
## (Intercept)            0.091411   0.021658  4.2207 2.436e-05
## scale(ppersonspo)      0.444740   0.020665 21.5218 < 2.2e-16
## scale(p65plus)         0.040861   0.016686  2.4489 0.0143312
## scale(pblack_1)        0.075392   0.026577  2.8368 0.0045575
## scale(phisp)          -0.051968   0.028221 -1.8415 0.0655541
## I(RUCC >= 7)TRUE      -0.149849   0.029966 -5.0007 5.712e-07
## lag.scale(ppersonspo) -0.110889   0.028417 -3.9022 9.534e-05
## lag.scale(p65plus)    -0.086672   0.022904 -3.7841 0.0001543
## lag.scale(pblack_1)   -0.056853   0.030005 -1.8948 0.0581162
## lag.scale(phisp)      -0.117467   0.031439 -3.7364 0.0001867
## lag.I(RUCC >= 7)TRUE  -0.048864   0.044962 -1.0868 0.2771334
## 
## Rho: 0.55495, LR test value: 814.43, p-value: < 2.22e-16
## Approximate (numerical Hessian) standard error: 0.016803
##     z-value: 33.027, p-value: < 2.22e-16
## Wald statistic: 1090.8, p-value: < 2.22e-16
## 
## Log likelihood: -3011.115 for mixed model
## ML residual variance (sigma squared): 0.38536, (sigma: 0.62077)
## Nagelkerke pseudo-R-squared: 0.58272 
## Number of observations: 3067 
## Number of parameters estimated: 13 
## AIC: 6048.2, (AIC for lm: 6860.7)
#Spatial Durbin Error Model
fit.errdurb<-errorsarlm(scale(mortrate)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7), spdat, listw=us.wt4, etype="emixed", method="MC")
summary(fit.errdurb, Nagelkerke=T)
## 
## Call:
## errorsarlm(formula = scale(mortrate) ~ scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7), data = spdat, 
##     listw = us.wt4, etype = "emixed", method = "MC")
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -3.3946501 -0.3422770  0.0083312  0.3680210  4.2256377 
## 
## Type: error 
## Coefficients: (asymptotic standard errors) 
##                         Estimate Std. Error z value  Pr(>|z|)
## (Intercept)            0.1414754  0.0405295  3.4907 0.0004818
## scale(ppersonspo)      0.4793960  0.0194434 24.6559 < 2.2e-16
## scale(p65plus)         0.0214846  0.0160602  1.3378 0.1809759
## scale(pblack_1)        0.0839953  0.0238672  3.5193 0.0004327
## scale(phisp)          -0.0976088  0.0251959 -3.8740 0.0001071
## I(RUCC >= 7)TRUE      -0.1706071  0.0301336 -5.6617 1.499e-08
## lag.scale(ppersonspo)  0.1717194  0.0315918  5.4356 5.462e-08
## lag.scale(p65plus)    -0.1017092  0.0290363 -3.5028 0.0004603
## lag.scale(pblack_1)   -0.0068911  0.0331884 -0.2076 0.8355142
## lag.scale(phisp)      -0.2352339  0.0332301 -7.0789 1.453e-12
## lag.I(RUCC >= 7)TRUE  -0.1389307  0.0574358 -2.4189 0.0155681
## 
## Lambda: 0.55893, LR test value: 791.07, p-value: < 2.22e-16
## Approximate (numerical Hessian) standard error: 0.01721
##     z-value: 32.477, p-value: < 2.22e-16
## Wald statistic: 1054.7, p-value: < 2.22e-16
## 
## Log likelihood: -3022.794 for error model
## ML residual variance (sigma squared): 0.38841, (sigma: 0.62322)
## Nagelkerke pseudo-R-squared: 0.57953 
## Number of observations: 3067 
## Number of parameters estimated: 13 
## AIC: 6071.6, (AIC for lm: 6860.7)
#SAC Model
fit.sac<-sacsarlm(scale(mortrate)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7), spdat, listw=us.wt4, type="sac", method="MC")
summary(fit.sac, Nagelkerke=T)
## 
## Call:
## sacsarlm(formula = scale(mortrate) ~ scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7), data = spdat, 
##     listw = us.wt4, type = "sac", method = "MC")
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -2.7890271 -0.3033368  0.0020529  0.3217538  3.8263528 
## 
## Type: sac 
## Coefficients: (numerical Hessian approximate standard errors) 
##                    Estimate Std. Error z value  Pr(>|z|)
## (Intercept)        0.056506   0.077491  0.7292 0.4658783
## scale(ppersonspo)  0.381989   0.019198 19.8971 < 2.2e-16
## scale(p65plus)     0.034018   0.014848  2.2911 0.0219605
## scale(pblack_1)    0.081491   0.023604  3.4525 0.0005555
## scale(phisp)      -0.076534   0.025106 -3.0485 0.0023000
## I(RUCC >= 7)TRUE  -0.101725   0.026732 -3.8054 0.0001416
## 
## Rho: -0.5452
## Approximate (numerical Hessian) standard error: 0.035148
##     z-value: -15.512, p-value: < 2.22e-16
## Lambda: 0.8671
## Approximate (numerical Hessian) standard error: 0.012209
##     z-value: 71.021, p-value: < 2.22e-16
## 
## LR test value: 966.19, p-value: < 2.22e-16
## 
## Log likelihood: -3011.862 for sac model
## ML residual variance (sigma squared): 0.30751, (sigma: 0.55454)
## Nagelkerke pseudo-R-squared: 0.58252 
## Number of observations: 3067 
## Number of parameters estimated: 9 
## AIC: 6041.7, (AIC for lm: 7003.9)
#SMA Model
fit.sma<-spautolm(scale(mortrate)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7), spdat, listw=us.wt4, family="SMA")
summary(fit.sma)
## 
## Call: 
## spautolm(formula = scale(mortrate) ~ scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7), data = spdat, 
##     listw = us.wt4, family = "SMA")
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -3.293104 -0.344431  0.016537  0.380445  4.329735 
## 
## Coefficients: 
##                    Estimate Std. Error  z value  Pr(>|z|)
## (Intercept)        0.085878   0.024072   3.5675 0.0003603
## scale(ppersonspo)  0.519189   0.018359  28.2795 < 2.2e-16
## scale(p65plus)    -0.014067   0.015643  -0.8992 0.3685331
## scale(pblack_1)    0.131516   0.019449   6.7619 1.362e-11
## scale(phisp)      -0.228134   0.018665 -12.2224 < 2.2e-16
## I(RUCC >= 7)TRUE  -0.186258   0.030287  -6.1497 7.763e-10
## 
## Lambda: 0.54914 LR test value: 645.76 p-value: < 2.22e-16 
## Numerical Hessian standard error of lambda: 0.021073 
## 
## Log likelihood: -3172.079 
## ML residual variance (sigma squared): 0.49357, (sigma: 0.70254)
## Number of observations: 3067 
## Number of parameters estimated: 8 
## AIC: 6360.2

4.3 Using the Lagrange Multiplier Test (LMT)

The so-called Lagrange Multiplier tests (econometricians’ jargon for score tests) compare the model fits of the OLS, spatial error, and spatial lag specifications.

For those who don’t remember, the score test is based on the relative change in the first derivative of the likelihood function around the maximum likelihood estimate. The particular thing affecting the value of this derivative here is the autoregressive parameter, \(\rho\) or \(\lambda\). In the OLS model \(\rho = \lambda = 0\) (so both the lag and error models simplify to OLS), but as this parameter moves away from 0, so does the likelihood for the model, which is why the derivative of the likelihood function is used. This is all related to how the estimation routines estimate the value of \(\rho\) or \(\lambda\).

In general, you fit the OLS model to your dependent variable, then submit the OLS model fit to the LMT testing procedure.

Then you look to see which model (spatial error, or spatial lag) has the highest value for the test.

Enter the uncertainty… So how much bigger, you might say?

Well, drastically bigger, if the LMT for the error model is 2500 and the LMT for the lag model is 2480, this is NOT A BIG DIFFERENCE, only about 1%. If you see a LMT for the error model of 2500 and a LMT for the lag model of 250, THIS IS A BIG DIFFERENCE.

So what if you don’t see a BIG DIFFERENCE, HOW DO YOU DECIDE WHICH MODEL TO USE???

Well, you could think more, but who has time for that.

The econometricians have thought up a “better” LMT, the so-called robust LMT (robust, in this case, to the presence of the other form of spatial dependence), which is said to settle such problems of a “not so big difference” between the lag and error model specifications.

So what do you do? In general, think about your problem before you run your analysis, should this fail you, proceed with using the LMT, if this is inconclusive, look at the robust LMT, and choose the model which has the larger value for this test.

Here’s how we do the Lagrange Multiplier test in R:

lm.LMtests(fit.1.us, listw=us.wt4, test="all")
## 
##  Lagrange multiplier diagnostics for spatial dependence
## 
## data:  
## model: lm(formula = scale(mortrate) ~ scale(ppersonspo) +
## scale(p65plus) + scale(pblack_1) + scale(phisp) + I(RUCC >= 7),
## data = spdat)
## weights: us.wt4
## 
## LMerr = 1056.8, df = 1, p-value < 2.2e-16
## 
## 
##  Lagrange multiplier diagnostics for spatial dependence
## 
## data:  
## model: lm(formula = scale(mortrate) ~ scale(ppersonspo) +
## scale(p65plus) + scale(pblack_1) + scale(phisp) + I(RUCC >= 7),
## data = spdat)
## weights: us.wt4
## 
## LMlag = 1084.9, df = 1, p-value < 2.2e-16
## 
## 
##  Lagrange multiplier diagnostics for spatial dependence
## 
## data:  
## model: lm(formula = scale(mortrate) ~ scale(ppersonspo) +
## scale(p65plus) + scale(pblack_1) + scale(phisp) + I(RUCC >= 7),
## data = spdat)
## weights: us.wt4
## 
## RLMerr = 77.419, df = 1, p-value < 2.2e-16
## 
## 
##  Lagrange multiplier diagnostics for spatial dependence
## 
## data:  
## model: lm(formula = scale(mortrate) ~ scale(ppersonspo) +
## scale(p65plus) + scale(pblack_1) + scale(phisp) + I(RUCC >= 7),
## data = spdat)
## weights: us.wt4
## 
## RLMlag = 105.56, df = 1, p-value < 2.2e-16
## 
## 
##  Lagrange multiplier diagnostics for spatial dependence
## 
## data:  
## model: lm(formula = scale(mortrate) ~ scale(ppersonspo) +
## scale(p65plus) + scale(pblack_1) + scale(phisp) + I(RUCC >= 7),
## data = spdat)
## weights: us.wt4
## 
## SARMA = 1162.4, df = 2, p-value < 2.2e-16

There is a 2.66% difference in the regular LM tests between the error and lag models, but a 36.35% difference in the robust LM tests. In this case, I would say that the lag model looks like the best one, using the robust Lagrange multiplier test, or possibly the SARMA model, since its test is only 7.14% larger than the lag model’s. Unfortunately, there is no robust test for the SARMA model.
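The percentages above come straight from the printed statistics; a quick sketch reproduces them (the helper function is just for illustration, not part of spdep):

```r
# Relative differences between the LM statistics printed above
pct_diff <- function(a, b) 100 * (a - b) / b
pct_diff(1084.9, 1056.8)   # LMlag vs LMerr: ~2.66%
pct_diff(105.56, 77.419)   # RLMlag vs RLMerr: ~36.35%
pct_diff(1162.4, 1084.9)   # SARMA vs LMlag: ~7.14%
```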

Of course, the AIC is also your friend:

AICs<-c(AIC(fit.1.us),AIC(fit.lag), AIC(fit.err), AIC(fit.durb), AIC(fit.errdurb), AIC(fit.sac), AIC(fit.sma))
plot(AICs, type="l", lwd=1.5, xaxt="n", xlab="")
axis(1, at=1:7,labels=F) #7 = number of models
labels<-c("OLS", "Lag","Err", "Durbin","Err Durbin", "SAC", "SMA" )
text(1:7, par("usr")[3]-.25, srt=45, adj=1, labels=labels, xpd=T)
mtext(side=1, text="Model Specification", line=3)
symbols(x= which.min(AICs), y=AICs[which.min(AICs)], circles=1, fg=2,lwd=2,add=T)

knitr::kable(data.frame(Models=labels, AIC=round(AICs, 2)))
Models AIC
OLS 7003.92
Lag 6104.14
Err 6143.49
Durbin 6048.23
Err Durbin 6071.59
SAC 6041.72
SMA 6360.16

Which shows that the SAC model has the lowest AIC, although the difference between it and the Spatial Durbin model is small. A likelihood ratio test could be used:

anova(fit.sac, fit.durb)
##          Model df    AIC  logLik Test L.Ratio p-value
## fit.sac      1  9 6041.7 -3011.9    1                
## fit.durb     2 13 6048.2 -3011.1    2  1.4934 0.82781

Which indicates that the Durbin model does not fit significantly better than the SAC model (p = 0.83); the two fit about equally well, and the Durbin model’s lagged predictors are easier to interpret. Durbin it is!!

4.4 Interpreting effects in spatial lag models

In spatial lag models, interpretation of the regression effects is complicated. Each observation has a direct effect of its own predictors, but also an indirect effect of the information from its neighbors (Spatial Error models do not have this issue). In OLS, the impact/effect of a predictor is straightforward: \(\frac {\delta y_i} {\delta x_{ik}} = \beta_k\) and \(\frac {\delta y_i} {\delta x_{jk}} = 0\). But when a model has a spatial lag of either the outcome or a predictor, this becomes more complicated; indeed, \(\frac {\delta y_i} {\delta x_{jk}}\) need not be 0. Instead, \(\frac {\delta y_i} {\delta x_{jk}} = S_r(W)_{ij}\), where \(S_r(W) = (I_n - \rho W)^{-1} \beta_k\). This implies that a change in the ith region’s predictor can affect the jth region’s outcome. We have 2 situations:

* \(S_r(W)_{ii}\), the direct impact of an observation’s predictor on its own outcome, and
* \(S_r(W)_{ij}\), the indirect impact of an observation’s neighbor’s predictor on its outcome.

This leads to three quantities that we want to know:

* Average Direct Impact, which is similar to a traditional interpretation
* Average Total Impact, the total of direct and indirect impacts of a predictor on one’s outcome
* Average Indirect Impact, the average impact of one’s neighbors on one’s outcome
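As a small toy sketch (made-up \(W\), \(\rho\), and \(\beta_k\), not values from the model above), these averages can be computed directly from \(S_r(W)\):

```r
# Toy 3-county system with a row-standardized W (all values made up)
W <- matrix(c( 0, .5, .5,
              .5,  0, .5,
              .5, .5,  0), nrow = 3, byrow = TRUE)
rho <- 0.5
beta_k <- 2
S <- solve(diag(3) - rho * W) * beta_k  # S_r(W) = (I - rho*W)^{-1} * beta_k
mean(diag(S))                     # Average Direct Impact: 2.4
sum(S) / nrow(S)                  # Average Total Impact: 4
sum(S) / nrow(S) - mean(diag(S))  # Average Indirect Impact: 1.6
```

A handy check: for a row-standardized \(W\), the average total impact equals \(\beta_k / (1 - \rho)\), here \(2 / (1 - 0.5) = 4\).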

These quantities can be found using the impacts() function in the spdep library. We first convert the spatial weight matrix into a “sparse” matrix and power it up using the trW() function, following the approximation methods described in LeSage and Pace (2009). Here, we use Monte Carlo simulation to obtain simulated distributions of the various impacts. We are looking at the first part of the output, the impact estimates, and their simulated p-values.

W <- as(us.wt4, "CsparseMatrix")
trMC <- trW(W, type="MC")
im<-impacts(fit.durb, tr=trMC, R=100)
sums<-summary(im,  zstats=T)
data.frame(sums$res)
##                        direct    indirect       total
## scale(ppersonspo)  0.46680338  0.28334854  0.75015193
## scale(p65plus)     0.03047277 -0.13340822 -0.10293545
## scale(pblack_1)    0.07295465 -0.03129953  0.04165512
## scale(phisp)      -0.07571679 -0.30499685 -0.38071364
## I(RUCC >= 7)TRUE  -0.17127960 -0.27522245 -0.44650206
data.frame(sums$pzmat)
##                         Direct     Indirect        Total
## scale(ppersonspo) 0.000000e+00 8.881784e-16 0.000000e+00
## scale(p65plus)    5.391188e-02 1.935872e-03 2.034772e-02
## scale(pblack_1)   2.849162e-03 4.265916e-01 2.475250e-01
## scale(phisp)      3.026314e-03 2.220446e-16 0.000000e+00
## I(RUCC >= 7)TRUE  1.100832e-09 6.696851e-04 4.094315e-07

We see that all variables except % 65 and older (p = .054) have a significant direct effect; we also see that poverty, % 65 and older, % Hispanic, and the rural classification all have significant indirect impacts.

We can likewise see the effects by order of neighbors, similar to what Yang et al. (2015) do in their Table 4.

Here, I do this up to 5th order neighbors.

im2<-impacts(fit.durb, tr=trMC, R=100, Q=5)
sums2<-summary(im2,  zstats=T, reportQ=T, short=T)
sums2
## Impact measures (mixed, trace):
##                        Direct    Indirect       Total
## scale(ppersonspo)  0.46680338  0.28334854  0.75015193
## scale(p65plus)     0.03047277 -0.13340822 -0.10293545
## scale(pblack_1)    0.07295465 -0.03129953  0.04165512
## scale(phisp)      -0.07571679 -0.30499685 -0.38071364
## I(RUCC >= 7)TRUE  -0.17127960 -0.27522245 -0.44650206
## =================================
## Impact components
## $direct
##    scale(ppersonspo) scale(p65plus) scale(pblack_1) scale(phisp)
## Q1       0.444740485   4.086059e-02    0.0753917837 -0.051968248
## Q2      -0.013179962  -1.030156e-02   -0.0067574457 -0.013961782
## Q3       0.027633990   1.365427e-03    0.0041006069 -0.005230081
## Q4       0.001869367  -1.150692e-03   -0.0003411090 -0.002473535
## Q5       0.003753210  -9.884998e-06    0.0004597465 -0.001043439
##    I(RUCC >= 7)TRUE
## Q1     -0.149849443
## Q2     -0.005807880
## Q3     -0.010633842
## Q4     -0.002120764
## Q5     -0.001664493
## 
## $indirect
##    scale(ppersonspo) scale(p65plus) scale(pblack_1) scale(phisp)
## Q1       -0.11088882   -0.086671536    -0.056853362  -0.11746661
## Q2        0.19845248   -0.015121439     0.017045428  -0.08006687
## Q3        0.07518385   -0.015474036     0.001608756  -0.04695155
## Q4        0.05518986   -0.006678945     0.003509546  -0.02648490
## Q5        0.02791207   -0.004335208     0.001298592  -0.01502718
##    I(RUCC >= 7)TRUE
## Q1      -0.04886425
## Q2      -0.10446920
## Q3      -0.05056493
## Q4      -0.03184178
## Q5      -0.01718318
## 
## $total
##    scale(ppersonspo) scale(p65plus) scale(pblack_1) scale(phisp)
## Q1        0.33385167   -0.045810948     0.018538422  -0.16943486
## Q2        0.18527252   -0.025422997     0.010287982  -0.09402865
## Q3        0.10281784   -0.014108609     0.005709363  -0.05218163
## Q4        0.05705923   -0.007829637     0.003168437  -0.02895844
## Q5        0.03166528   -0.004345093     0.001758339  -0.01607062
##    I(RUCC >= 7)TRUE
## Q1      -0.19871369
## Q2      -0.11027708
## Q3      -0.06119877
## Q4      -0.03396254
## Q5      -0.01884767
## 
## ========================================================
## Simulation results (numerical Hessian approximation variance matrix):
## ========================================================
## Simulated z-values:
##                      Direct   Indirect      Total
## scale(ppersonspo) 26.488216  6.5488251  18.684378
## scale(p65plus)     1.964213 -3.6304733  -2.468294
## scale(pblack_1)    2.762210 -0.5809559   1.335944
## scale(phisp)      -3.053773 -8.1272288 -12.423055
## I(RUCC >= 7)TRUE  -6.197291 -3.4949437  -5.223965
## 
## Simulated p-values:
##                   Direct     Indirect   Total     
## scale(ppersonspo) < 2.22e-16 5.7992e-11 < 2.22e-16
## scale(p65plus)    0.0495054  0.00028290 0.013576  
## scale(pblack_1)   0.0057412  0.56127018 0.181568  
## scale(phisp)      0.0022598  4.4409e-16 < 2.22e-16
## I(RUCC >= 7)TRUE  5.7443e-10 0.00047416 1.7513e-07
## ========================================================
## Simulated impact components z-values:
## $Direct
##    scale(ppersonspo) scale(p65plus) scale(pblack_1) scale(phisp)
## Q1         23.568425     2.65499361        2.620038    -1.966870
## Q2         -3.923632    -4.36718591       -1.580947    -3.859617
## Q3         16.217015     1.45481292        2.772959    -3.708162
## Q4          4.857377    -3.82034550       -0.747322    -6.958105
## Q5          9.121798    -0.05172918        2.826842    -5.464313
##    I(RUCC >= 7)TRUE
## Q1        -5.446860
## Q2        -1.046169
## Q3        -6.318269
## Q4        -3.144287
## Q5        -5.702520
## 
## $Indirect
##    scale(ppersonspo) scale(p65plus) scale(pblack_1) scale(phisp)
## Q1         -4.136515      -4.274263       -1.595002    -3.792087
## Q2         25.002299      -1.696323        2.457391   -11.905404
## Q3         13.819008      -2.987487        0.469257   -10.638076
## Q4         13.633843      -2.311719        1.589642    -9.673309
## Q5         10.163734      -2.633620        1.031422    -7.979387
##    I(RUCC >= 7)TRUE
## Q1        -1.052248
## Q2        -5.952385
## Q3        -4.592787
## Q4        -5.174291
## Q5        -4.619953
## 
## $Total
##    scale(ppersonspo) scale(p65plus) scale(pblack_1) scale(phisp)
## Q1          14.77653      -2.430522        1.344992   -11.860719
## Q2          18.37052      -2.461782        1.338155   -12.540319
## Q3          17.48378      -2.479181        1.328362   -11.518940
## Q4          13.48238      -2.482185        1.315742    -9.693437
## Q5          10.17624      -2.470942        1.300464    -7.984876
##    I(RUCC >= 7)TRUE
## Q1        -5.110864
## Q2        -5.221182
## Q3        -5.188565
## Q4        -5.021436
## Q5        -4.755516
## 
## 
## Simulated impact components p-values:
## $Direct
##    scale(ppersonspo) scale(p65plus) scale(pblack_1) scale(phisp)
## Q1 < 2.22e-16        0.00793099     0.0087920       0.04919822  
## Q2 8.7224e-05        1.2586e-05     0.1138902       0.00011356  
## Q3 < 2.22e-16        0.14572106     0.0055549       0.00020877  
## Q4 1.1895e-06        0.00013326     0.4548692       3.4488e-12  
## Q5 < 2.22e-16        0.95874449     0.0047009       4.6470e-08  
##    I(RUCC >= 7)TRUE
## Q1 5.1267e-08      
## Q2 0.2954830       
## Q3 2.6451e-10      
## Q4 0.0016649       
## Q5 1.1805e-08      
## 
## $Indirect
##    scale(ppersonspo) scale(p65plus) scale(pblack_1) scale(phisp)
## Q1 3.5262e-05        1.9177e-05     0.110712        0.00014939  
## Q2 < 2.22e-16        0.0898247      0.013995        < 2.22e-16  
## Q3 < 2.22e-16        0.0028128      0.638886        < 2.22e-16  
## Q4 < 2.22e-16        0.0207931      0.111916        < 2.22e-16  
## Q5 < 2.22e-16        0.0084480      0.302343        1.5543e-15  
##    I(RUCC >= 7)TRUE
## Q1 0.29269         
## Q2 2.6426e-09      
## Q3 4.3736e-06      
## Q4 2.2878e-07      
## Q5 3.8383e-06      
## 
## $Total
##    scale(ppersonspo) scale(p65plus) scale(pblack_1) scale(phisp)
## Q1 < 2.22e-16        0.015077       0.17863         < 2.22e-16  
## Q2 < 2.22e-16        0.013825       0.18085         < 2.22e-16  
## Q3 < 2.22e-16        0.013168       0.18406         < 2.22e-16  
## Q4 < 2.22e-16        0.013058       0.18826         < 2.22e-16  
## Q5 < 2.22e-16        0.013476       0.19344         1.3323e-15  
##    I(RUCC >= 7)TRUE
## Q1 3.2069e-07      
## Q2 1.7778e-07      
## Q3 2.1192e-07      
## Q4 5.1287e-07      
## Q5 1.9794e-06

So we see that, for instance, for the direct impact of poverty, .4446/.4667 = 95.26% of the effect is due to a county’s own influence on itself, while (-.013 + .0277 + .0019 + .0037)/.4667 = 4.35% of the effect comes from feedback through neighboring counties.
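This arithmetic can be reproduced from the $direct component column for poverty printed above:

```r
# Direct-impact components for poverty (Q1..Q5 from the output above)
q_direct <- c(0.444740485, -0.013179962, 0.027633990, 0.001869367, 0.003753210)
total_direct <- 0.46680338        # total direct impact from the summary table
q_direct[1] / total_direct        # own-county (first-order) share, ~0.95
sum(q_direct[-1]) / total_direct  # feedback share through neighbors, ~0.04
```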

5 Geographically Weighted Regression

Generally, if we have a continuous outcome, we consider using the OLS model; when the data are collected over space, additional assumptions come into play.

library(spdep)
library(maptools)
library(car)
library(lmtest)
library(spgwr)
library(classInt)
library(RColorBrewer)

Stationarity is a general term in statistics/math; generally it means that something doesn’t change over time or space, e.g., a stationary population, or a stationary time series. In spatial statistics, we can say stationarity = homogeneity of an effect, or that a process works the same regardless of where you observe the process. The latter can be a weak assumption, and we can ask: does X affect Y differently at different geographic locations? In terms of parameters: if we estimate \(\beta = .5\) in OLS, are there locations in the data where \(\beta \neq .5\)?

5.0.1 Approaches to non-stationarity

There are several approaches to dealing with this problem. Among them are:

* Create zones of homogeneity and stratify the analysis
* Allow parameters to vary continuously over space

The first technique is stratification, or spatial regimes. This is where we stratify the data by either a known fixed quality, such as the state in which a county is located, or perhaps some other socio-demographic or socio-economic variable.

We then fit separate regression models for each region. These could be OLS or some other specification we have discussed.

First, we prepare our data, subset to five states, and construct nearest-neighbor sets:

spdat$population<-as.numeric(as.character(spdat$Population))
spdat$deaths<-as.numeric(as.character(spdat$Deaths))
spdat<-spdat[spdat$STATE%in%c("05","22","40","35", "48"),]
spdat$mig_group<-cut(spdat$p5yrinmig, breaks = quantile(spdat$p5yrinmig, p=c(0, .3, .6, 1)), include.lowest = T)
lvs<-levels(spdat$mig_group)
lvs
## [1] "[0.0622,0.18]" "(0.18,0.227]"  "(0.227,0.504]"
nbs<-knearneigh(coordinates(spdat), k=2)

nbsmg1<-knearneigh(coordinates(spdat[spdat$mig_group==lvs[1],]), k=2)
nbsmg2<-knearneigh(coordinates(spdat[spdat$mig_group==lvs[2],]), k=2)
nbsmg3<-knearneigh(coordinates(spdat[spdat$mig_group==lvs[3],]), k=2)

nbs<-knn2nb(nbs,sym=T)
nbs1<-knn2nb(nbsmg1, sym=T)
nbs2<-knn2nb(nbsmg2, sym=T)
nbs3<-knn2nb(nbsmg3, sym=T)

Here’s the basic OLS model:

fit.1<-lm(I(deaths/population)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7), data=spdat)
summary(fit.1)
## 
## Call:
## lm(formula = I(deaths/population) ~ scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7), data = spdat)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0066877 -0.0005928  0.0000377  0.0006448  0.0031275 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        1.046e-02  7.256e-05 144.199  < 2e-16 ***
## scale(ppersonspo)  5.545e-04  7.066e-05   7.847 2.64e-14 ***
## scale(p65plus)     2.092e-03  6.286e-05  33.281  < 2e-16 ***
## scale(pblack_1)    9.032e-05  7.291e-05   1.239    0.216    
## scale(phisp)      -8.643e-04  7.401e-05 -11.677  < 2e-16 ***
## I(RUCC >= 7)TRUE  -1.620e-04  1.195e-04  -1.356    0.176    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.001159 on 495 degrees of freedom
## Multiple R-squared:  0.8073, Adjusted R-squared:  0.8054 
## F-statistic: 414.8 on 5 and 495 DF,  p-value: < 2.2e-16

We could stratify by state:

#Examine basic regimes, using states as a nesting mechanism
spplot(spdat, "STATE")

fit.strat<-lm(I(deaths/population)~STATE/(scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7)), data=spdat)
summary(fit.strat)
## 
## Call:
## lm(formula = I(deaths/population) ~ STATE/(scale(ppersonspo) + 
##     scale(p65plus) + scale(pblack_1) + scale(phisp) + I(RUCC >= 
##     7)), data = spdat)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0044801 -0.0005148  0.0000290  0.0006616  0.0043913 
## 
## Coefficients:
##                             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                9.682e-03  6.809e-04  14.219  < 2e-16 ***
## STATE22                   -9.979e-04  2.059e-03  -0.485 0.628201    
## STATE35                   -1.388e-04  1.560e-03  -0.089 0.929152    
## STATE40                    1.940e-03  8.023e-04   2.418 0.015976 *  
## STATE48                    6.572e-04  6.893e-04   0.953 0.340858    
## STATE05:scale(ppersonspo)  6.526e-04  2.551e-04   2.558 0.010828 *  
## STATE22:scale(ppersonspo)  3.219e-04  2.586e-04   1.245 0.213715    
## STATE35:scale(ppersonspo)  5.528e-04  1.943e-04   2.845 0.004634 ** 
## STATE40:scale(ppersonspo)  8.200e-04  1.930e-04   4.249 2.59e-05 ***
## STATE48:scale(ppersonspo)  2.606e-04  1.112e-04   2.344 0.019477 *  
## STATE05:scale(p65plus)     1.953e-03  1.991e-04   9.808  < 2e-16 ***
## STATE22:scale(p65plus)     2.355e-03  2.776e-04   8.485 2.83e-16 ***
## STATE35:scale(p65plus)     1.575e-03  1.694e-04   9.299  < 2e-16 ***
## STATE40:scale(p65plus)     2.530e-03  1.855e-04  13.639  < 2e-16 ***
## STATE48:scale(p65plus)     2.228e-03  8.561e-05  26.025  < 2e-16 ***
## STATE05:scale(pblack_1)    2.350e-04  1.605e-04   1.464 0.143734    
## STATE22:scale(pblack_1)    9.331e-05  1.901e-04   0.491 0.623848    
## STATE35:scale(pblack_1)    6.666e-04  2.100e-03   0.317 0.751093    
## STATE40:scale(pblack_1)    1.879e-04  4.565e-04   0.412 0.680838    
## STATE48:scale(pblack_1)    4.049e-04  1.633e-04   2.480 0.013505 *  
## STATE05:scale(phisp)      -1.783e-03  9.267e-04  -1.925 0.054887 .  
## STATE22:scale(phisp)      -3.092e-03  2.549e-03  -1.213 0.225843    
## STATE35:scale(phisp)       1.852e-04  2.336e-04   0.793 0.428117    
## STATE40:scale(phisp)       1.186e-04  5.806e-04   0.204 0.838172    
## STATE48:scale(phisp)      -4.991e-04  1.334e-04  -3.741 0.000206 ***
## STATE05:I(RUCC >= 7)TRUE   2.247e-04  3.017e-04   0.745 0.456661    
## STATE22:I(RUCC >= 7)TRUE   3.250e-04  3.560e-04   0.913 0.361859    
## STATE35:I(RUCC >= 7)TRUE  -5.091e-04  4.390e-04  -1.160 0.246774    
## STATE40:I(RUCC >= 7)TRUE  -2.993e-04  2.947e-04  -1.016 0.310346    
## STATE48:I(RUCC >= 7)TRUE  -3.663e-04  1.629e-04  -2.249 0.024975 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.001079 on 471 degrees of freedom
## Multiple R-squared:  0.8412, Adjusted R-squared:  0.8314 
## F-statistic: 86.01 on 29 and 471 DF,  p-value: < 2.2e-16
anova(fit.1, fit.strat)
## Analysis of Variance Table
## 
## Model 1: I(deaths/population) ~ scale(ppersonspo) + scale(p65plus) + scale(pblack_1) + 
##     scale(phisp) + I(RUCC >= 7)
## Model 2: I(deaths/population) ~ STATE/(scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7))
##   Res.Df        RSS Df  Sum of Sq     F    Pr(>F)    
## 1    495 0.00066533                                  
## 2    471 0.00054848 24 0.00011685 4.181 4.489e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Or, we could examine an alternative regime, based on a demographic characteristic, say migration:

spplot(spdat, "mig_group", col.regions=brewer.pal(n=3, "Greys"))

fit.strat2<-lm(I(deaths/population)~mig_group/(scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7)), data=spdat)
summary(fit.strat2)
## 
## Call:
## lm(formula = I(deaths/population) ~ mig_group/(scale(ppersonspo) + 
##     scale(p65plus) + scale(pblack_1) + scale(phisp) + I(RUCC >= 
##     7)), data = spdat)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0060718 -0.0005637  0.0000301  0.0006485  0.0033088 
## 
## Coefficients:
##                                            Estimate Std. Error t value
## (Intercept)                               1.052e-02  1.417e-04  74.268
## mig_group(0.18,0.227]                     2.396e-04  1.891e-04   1.267
## mig_group(0.227,0.504]                   -3.476e-04  1.900e-04  -1.830
## mig_group[0.0622,0.18]:scale(ppersonspo)  2.522e-04  1.135e-04   2.222
## mig_group(0.18,0.227]:scale(ppersonspo)   5.935e-04  1.557e-04   3.812
## mig_group(0.227,0.504]:scale(ppersonspo)  5.085e-04  1.198e-04   4.243
## mig_group[0.0622,0.18]:scale(p65plus)     1.738e-03  1.272e-04  13.661
## mig_group(0.18,0.227]:scale(p65plus)      2.175e-03  1.264e-04  17.204
## mig_group(0.227,0.504]:scale(p65plus)     2.118e-03  8.822e-05  24.007
## mig_group[0.0622,0.18]:scale(pblack_1)    1.319e-04  1.149e-04   1.147
## mig_group(0.18,0.227]:scale(pblack_1)    -1.363e-04  1.404e-04  -0.971
## mig_group(0.227,0.504]:scale(pblack_1)    9.340e-05  1.453e-04   0.643
## mig_group[0.0622,0.18]:scale(phisp)      -8.396e-04  1.184e-04  -7.092
## mig_group(0.18,0.227]:scale(phisp)       -9.079e-04  1.304e-04  -6.962
## mig_group(0.227,0.504]:scale(phisp)      -8.952e-04  1.386e-04  -6.461
## mig_group[0.0622,0.18]:I(RUCC >= 7)TRUE   3.227e-04  2.125e-04   1.519
## mig_group(0.18,0.227]:I(RUCC >= 7)TRUE   -2.944e-04  2.044e-04  -1.441
## mig_group(0.227,0.504]:I(RUCC >= 7)TRUE  -3.302e-04  1.901e-04  -1.737
##                                          Pr(>|t|)    
## (Intercept)                               < 2e-16 ***
## mig_group(0.18,0.227]                    0.205722    
## mig_group(0.227,0.504]                   0.067872 .  
## mig_group[0.0622,0.18]:scale(ppersonspo) 0.026740 *  
## mig_group(0.18,0.227]:scale(ppersonspo)  0.000156 ***
## mig_group(0.227,0.504]:scale(ppersonspo) 2.65e-05 ***
## mig_group[0.0622,0.18]:scale(p65plus)     < 2e-16 ***
## mig_group(0.18,0.227]:scale(p65plus)      < 2e-16 ***
## mig_group(0.227,0.504]:scale(p65plus)     < 2e-16 ***
## mig_group[0.0622,0.18]:scale(pblack_1)   0.251779    
## mig_group(0.18,0.227]:scale(pblack_1)    0.332149    
## mig_group(0.227,0.504]:scale(pblack_1)   0.520587    
## mig_group[0.0622,0.18]:scale(phisp)      4.74e-12 ***
## mig_group(0.18,0.227]:scale(phisp)       1.10e-11 ***
## mig_group(0.227,0.504]:scale(phisp)      2.54e-10 ***
## mig_group[0.0622,0.18]:I(RUCC >= 7)TRUE  0.129412    
## mig_group(0.18,0.227]:I(RUCC >= 7)TRUE   0.150336    
## mig_group(0.227,0.504]:I(RUCC >= 7)TRUE  0.082976 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.001116 on 483 degrees of freedom
## Multiple R-squared:  0.8257, Adjusted R-squared:  0.8196 
## F-statistic: 134.6 on 17 and 483 DF,  p-value: < 2.2e-16
anova(fit.1, fit.strat2)
## Analysis of Variance Table
## 
## Model 1: I(deaths/population) ~ scale(ppersonspo) + scale(p65plus) + scale(pblack_1) + 
##     scale(phisp) + I(RUCC >= 7)
## Model 2: I(deaths/population) ~ mig_group/(scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7))
##   Res.Df        RSS Df Sum of Sq      F    Pr(>F)    
## 1    495 0.00066533                                  
## 2    483 0.00060171 12 6.362e-05 4.2557 2.056e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(fit.strat2, fit.strat)
## Analysis of Variance Table
## 
## Model 1: I(deaths/population) ~ mig_group/(scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7))
## Model 2: I(deaths/population) ~ STATE/(scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7))
##   Res.Df        RSS Df  Sum of Sq      F    Pr(>F)    
## 1    483 0.00060171                                   
## 2    471 0.00054848 12 5.3229e-05 3.8092 1.477e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
plot(spdat, main="Neighbors for low migration counties")
plot(nbs1,coords=coordinates(spdat[spdat$mig_group==lvs[1],]), col=2, add=T)

plot(spdat, main="Neighbors for medium migration counties")
plot(nbs2,coords=coordinates(spdat[spdat$mig_group==lvs[2],]), col=2, add=T)

plot(spdat, main="Neighbors for high migration counties")
plot(nbs3,coords=coordinates(spdat[spdat$mig_group==lvs[3],]), col=2, add=T)

wts<-nb2listw(nbs)
wts1<-nb2listw(nbs1)
wts2<-nb2listw(nbs2)
wts3<-nb2listw(nbs3)

So we see non-stationarity in several of the covariates with respect to state: the effects of poverty, age structure, and race/ethnicity all vary significantly by state. Similar results hold for the migration-rate grouping variable, although the final comparison of the two stratified models shows more variation between states than between migration groups.

If we wanted, we could also fit the model separately within each state. Below, I simply stratify the analysis by state and fit the OLS model to each subset.

fit.ar<-lm(I(deaths/population)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7), data=spdat, subset=STATE=="05")
summary(fit.ar)
## 
## Call:
## lm(formula = I(deaths/population) ~ scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7), data = spdat, 
##     subset = STATE == "05")
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -2.356e-03 -4.771e-04  3.499e-05  5.518e-04  2.209e-03 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        0.0096818  0.0005626  17.210  < 2e-16 ***
## scale(ppersonspo)  0.0006526  0.0002108   3.097  0.00283 ** 
## scale(p65plus)     0.0019531  0.0001645  11.871  < 2e-16 ***
## scale(pblack_1)    0.0002350  0.0001326   1.773  0.08072 .  
## scale(phisp)      -0.0017834  0.0007656  -2.329  0.02277 *  
## I(RUCC >= 7)TRUE   0.0002247  0.0002492   0.902  0.37037    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.0008916 on 69 degrees of freedom
## Multiple R-squared:  0.8106, Adjusted R-squared:  0.7969 
## F-statistic: 59.08 on 5 and 69 DF,  p-value: < 2.2e-16
fit.la<-lm(I(deaths/population)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7), data=spdat, subset=STATE=="22")
summary(fit.la)
## 
## Call:
## lm(formula = I(deaths/population) ~ scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7), data = spdat, 
##     subset = STATE == "22")
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0035436 -0.0003792 -0.0000223  0.0005726  0.0018121 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        8.684e-03  1.663e-03   5.222 2.50e-06 ***
## scale(ppersonspo)  3.219e-04  2.212e-04   1.455    0.151    
## scale(p65plus)     2.355e-03  2.375e-04   9.916 4.25e-14 ***
## scale(pblack_1)    9.331e-05  1.627e-04   0.573    0.569    
## scale(phisp)      -3.092e-03  2.181e-03  -1.417    0.162    
## I(RUCC >= 7)TRUE   3.250e-04  3.047e-04   1.067    0.291    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.0009234 on 58 degrees of freedom
## Multiple R-squared:  0.7974, Adjusted R-squared:  0.7799 
## F-statistic: 45.65 on 5 and 58 DF,  p-value: < 2.2e-16
fit.ok<-lm(I(deaths/population)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7), data=spdat, subset=STATE=="40")
summary(fit.ok)
## 
## Call:
## lm(formula = I(deaths/population) ~ scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7), data = spdat, 
##     subset = STATE == "40")
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -3.029e-03 -5.192e-04 -5.854e-05  6.775e-04  2.030e-03 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        0.0116220  0.0003860  30.106  < 2e-16 ***
## scale(ppersonspo)  0.0008200  0.0001755   4.671 1.38e-05 ***
## scale(p65plus)     0.0025302  0.0001688  14.993  < 2e-16 ***
## scale(pblack_1)    0.0001879  0.0004152   0.452    0.652    
## scale(phisp)       0.0001186  0.0005281   0.225    0.823    
## I(RUCC >= 7)TRUE  -0.0002993  0.0002681  -1.116    0.268    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.0009816 on 71 degrees of freedom
## Multiple R-squared:  0.8346, Adjusted R-squared:  0.8229 
## F-statistic: 71.65 on 5 and 71 DF,  p-value: < 2.2e-16
fit.nm<-lm(I(deaths/population)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7), data=spdat, subset=STATE=="35")
summary(fit.nm)
## 
## Call:
## lm(formula = I(deaths/population) ~ scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7), data = spdat, 
##     subset = STATE == "35")
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0044801 -0.0007503 -0.0000290  0.0008573  0.0043913 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        0.0095430  0.0021307   4.479 0.000123 ***
## scale(ppersonspo)  0.0005528  0.0002949   1.874 0.071753 .  
## scale(p65plus)     0.0015753  0.0002571   6.126 1.52e-06 ***
## scale(pblack_1)    0.0006666  0.0031883   0.209 0.835951    
## scale(phisp)       0.0001852  0.0003545   0.522 0.605598    
## I(RUCC >= 7)TRUE  -0.0005091  0.0006664  -0.764 0.451516    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.001638 on 27 degrees of freedom
## Multiple R-squared:  0.6221, Adjusted R-squared:  0.5521 
## F-statistic:  8.89 on 5 and 27 DF,  p-value: 4.382e-05
fit.tx<-lm(I(deaths/population)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7), data=spdat, subset=STATE=="48")
summary(fit.tx)
## 
## Call:
## lm(formula = I(deaths/population) ~ scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7), data = spdat, 
##     subset = STATE == "48")
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0042947 -0.0005506  0.0000912  0.0006799  0.0029490 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        1.034e-02  1.101e-04  93.921  < 2e-16 ***
## scale(ppersonspo)  2.606e-04  1.144e-04   2.278 0.023568 *  
## scale(p65plus)     2.228e-03  8.809e-05  25.292  < 2e-16 ***
## scale(pblack_1)    4.049e-04  1.680e-04   2.410 0.016702 *  
## scale(phisp)      -4.991e-04  1.373e-04  -3.636 0.000337 ***
## I(RUCC >= 7)TRUE  -3.663e-04  1.676e-04  -2.186 0.029786 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.00111 on 246 degrees of freedom
## Multiple R-squared:  0.8453, Adjusted R-squared:  0.8422 
## F-statistic: 268.9 on 5 and 246 DF,  p-value: < 2.2e-16

Or, we could run a SAR model for each subset using the weights we created above:

#first the "global" model
efit<-lagsarlm(scale(I(deaths/population))~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7), data=spdat, listw=wts)

Then, a regime specification, as used by Brazil (2015):

efit0<-lagsarlm(scale(I(deaths/population))~mig_group/(scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7)), data=spdat, listw=wts)
summary(efit0)
## 
## Call:
## lagsarlm(formula = scale(I(deaths/population)) ~ mig_group/(scale(ppersonspo) + 
##     scale(p65plus) + scale(pblack_1) + scale(phisp) + I(RUCC >= 
##     7)), data = spdat, listw = wts)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -2.2535080 -0.2175824  0.0092613  0.2420036  1.2534095 
## 
## Type: lag 
## Coefficients: (asymptotic standard errors) 
##                                           Estimate Std. Error z value
## (Intercept)                               0.070702   0.052020  1.3591
## mig_group(0.18,0.227]                     0.062960   0.069376  0.9075
## mig_group(0.227,0.504]                   -0.144586   0.069608 -2.0771
## mig_group[0.0622,0.18]:scale(ppersonspo)  0.089667   0.041622  2.1543
## mig_group(0.18,0.227]:scale(ppersonspo)   0.212126   0.057090  3.7156
## mig_group(0.227,0.504]:scale(ppersonspo)  0.161934   0.044400  3.6472
## mig_group[0.0622,0.18]:scale(p65plus)     0.613779   0.047683 12.8721
## mig_group(0.18,0.227]:scale(p65plus)      0.791811   0.047329 16.7298
## mig_group(0.227,0.504]:scale(p65plus)     0.784719   0.033044 23.7478
## mig_group[0.0622,0.18]:scale(pblack_1)    0.052671   0.042112  1.2507
## mig_group(0.18,0.227]:scale(pblack_1)    -0.042097   0.051436 -0.8184
## mig_group(0.227,0.504]:scale(pblack_1)    0.038860   0.053224  0.7301
## mig_group[0.0622,0.18]:scale(phisp)      -0.284882   0.044298 -6.4311
## mig_group(0.18,0.227]:scale(phisp)       -0.301421   0.048620 -6.1995
## mig_group(0.227,0.504]:scale(phisp)      -0.282886   0.052171 -5.4223
## mig_group[0.0622,0.18]:I(RUCC >= 7)TRUE   0.086956   0.078148  1.1127
## mig_group(0.18,0.227]:I(RUCC >= 7)TRUE   -0.121716   0.075011 -1.6226
## mig_group(0.227,0.504]:I(RUCC >= 7)TRUE  -0.154449   0.069837 -2.2116
##                                           Pr(>|z|)
## (Intercept)                              0.1741000
## mig_group(0.18,0.227]                    0.3641342
## mig_group(0.227,0.504]                   0.0377878
## mig_group[0.0622,0.18]:scale(ppersonspo) 0.0312148
## mig_group(0.18,0.227]:scale(ppersonspo)  0.0002027
## mig_group(0.227,0.504]:scale(ppersonspo) 0.0002651
## mig_group[0.0622,0.18]:scale(p65plus)    < 2.2e-16
## mig_group(0.18,0.227]:scale(p65plus)     < 2.2e-16
## mig_group(0.227,0.504]:scale(p65plus)    < 2.2e-16
## mig_group[0.0622,0.18]:scale(pblack_1)   0.2110316
## mig_group(0.18,0.227]:scale(pblack_1)    0.4131111
## mig_group(0.227,0.504]:scale(pblack_1)   0.4653192
## mig_group[0.0622,0.18]:scale(phisp)      1.267e-10
## mig_group(0.18,0.227]:scale(phisp)       5.665e-10
## mig_group(0.227,0.504]:scale(phisp)      5.883e-08
## mig_group[0.0622,0.18]:I(RUCC >= 7)TRUE  0.2658359
## mig_group(0.18,0.227]:I(RUCC >= 7)TRUE   0.1046657
## mig_group(0.227,0.504]:I(RUCC >= 7)TRUE  0.0269975
## 
## Rho: 0.12088, LR test value: 16.799, p-value: 4.156e-05
## Asymptotic standard error: 0.028979
##     z-value: 4.1712, p-value: 3.0305e-05
## Wald statistic: 17.399, p-value: 3.0305e-05
## 
## Log likelihood: -264.3111 for lag model
## ML residual variance (sigma squared): 0.16721, (sigma: 0.40891)
## Number of observations: 501 
## Number of parameters estimated: 20 
## AIC: 568.62, (AIC for lm: 583.42)
## LM test for residual autocorrelation
## test value: 3.5091, p-value: 0.061031
anova(efit, efit0)
##       Model df    AIC  logLik Test L.Ratio    p-value
## efit      1  8 595.77 -289.89    1                   
## efit0     2 20 568.62 -264.31    2   51.15 8.7649e-07

Or, I could fit the fully stratified models, one regime at a time by subsetting and using the group-specific spatial weights:

efit1<-lagsarlm(scale(I(deaths/population))~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7), data=spdat[spdat$mig_group==lvs[1],], listw=wts1)

efit2<-lagsarlm(scale(I(deaths/population))~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7), data=spdat[spdat$mig_group==lvs[2],], listw=wts2)

efit3<-lagsarlm(scale(I(deaths/population))~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7), data=spdat[spdat$mig_group==lvs[3],], listw=wts3)

summary(efit1)
## 
## Call:lagsarlm(formula = scale(I(deaths/population)) ~ scale(ppersonspo) + 
##     scale(p65plus) + scale(pblack_1) + scale(phisp) + I(RUCC >= 
##     7), data = spdat[spdat$mig_group == lvs[1], ], listw = wts1)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -2.4544851 -0.2654621 -0.0015307  0.2899709  1.0866101 
## 
## Type: lag 
## Coefficients: (asymptotic standard errors) 
##                    Estimate Std. Error z value  Pr(>|z|)
## (Intercept)       -0.044804   0.054807 -0.8175   0.41365
## scale(ppersonspo)  0.122484   0.054066  2.2655   0.02348
## scale(p65plus)     0.573820   0.048549 11.8195 < 2.2e-16
## scale(pblack_1)    0.054060   0.064939  0.8325   0.40515
## scale(phisp)      -0.401658   0.072147 -5.5672 2.588e-08
## I(RUCC >= 7)TRUE   0.102547   0.091158  1.1249   0.26062
## 
## Rho: 0.15548, LR test value: 6.4493, p-value: 0.0111
## Asymptotic standard error: 0.059236
##     z-value: 2.6247, p-value: 0.0086724
## Wald statistic: 6.8891, p-value: 0.0086724
## 
## Log likelihood: -103.2209 for lag model
## ML residual variance (sigma squared): 0.22757, (sigma: 0.47704)
## Number of observations: 151 
## Number of parameters estimated: 8 
## AIC: 222.44, (AIC for lm: 226.89)
## LM test for residual autocorrelation
## test value: 2.8762, p-value: 0.089897
summary(efit2)
## 
## Call:lagsarlm(formula = scale(I(deaths/population)) ~ scale(ppersonspo) + 
##     scale(p65plus) + scale(pblack_1) + scale(phisp) + I(RUCC >= 
##     7), data = spdat[spdat$mig_group == lvs[2], ], listw = wts2)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -1.244411 -0.258925 -0.022572  0.238146  1.121171 
## 
## Type: lag 
## Coefficients: (asymptotic standard errors) 
##                    Estimate Std. Error z value  Pr(>|z|)
## (Intercept)        0.061767   0.047187  1.3090    0.1905
## scale(ppersonspo)  0.170906   0.038992  4.3831 1.170e-05
## scale(p65plus)     0.778337   0.043870 17.7418 < 2.2e-16
## scale(pblack_1)   -0.031947   0.041736 -0.7655    0.4440
## scale(phisp)      -0.315580   0.044718 -7.0571 1.701e-12
## I(RUCC >= 7)TRUE  -0.133604   0.074314 -1.7978    0.0722
## 
## Rho: 0.10064, LR test value: 3.8633, p-value: 0.049354
## Asymptotic standard error: 0.050143
##     z-value: 2.007, p-value: 0.044746
## Wald statistic: 4.0282, p-value: 0.044746
## 
## Log likelihood: -77.42665 for lag model
## ML residual variance (sigma squared): 0.16374, (sigma: 0.40464)
## Number of observations: 150 
## Number of parameters estimated: 8 
## AIC: 170.85, (AIC for lm: 172.72)
## LM test for residual autocorrelation
## test value: 0.16968, p-value: 0.6804
summary(efit3)
## 
## Call:lagsarlm(formula = scale(I(deaths/population)) ~ scale(ppersonspo) + 
##     scale(p65plus) + scale(pblack_1) + scale(phisp) + I(RUCC >= 
##     7), data = spdat[spdat$mig_group == lvs[3], ], listw = wts3)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -1.384600 -0.188351  0.037029  0.229312  1.174578 
## 
## Type: lag 
## Coefficients: (asymptotic standard errors) 
##                    Estimate Std. Error z value  Pr(>|z|)
## (Intercept)        0.048072   0.038961  1.2338   0.21726
## scale(ppersonspo)  0.143595   0.035900  3.9998 6.339e-05
## scale(p65plus)     0.856224   0.036718 23.3190 < 2.2e-16
## scale(pblack_1)    0.023191   0.034032  0.6815   0.49558
## scale(phisp)      -0.207571   0.034207 -6.0680 1.295e-09
## I(RUCC >= 7)TRUE  -0.119329   0.067403 -1.7704   0.07666
## 
## Rho: 0.015616, LR test value: 0.12585, p-value: 0.72278
## Asymptotic standard error: 0.043516
##     z-value: 0.35886, p-value: 0.7197
## Wald statistic: 0.12878, p-value: 0.7197
## 
## Log likelihood: -97.39382 for lag model
## ML residual variance (sigma squared): 0.15505, (sigma: 0.39376)
## Number of observations: 200 
## Number of parameters estimated: 8 
## AIC: 210.79, (AIC for lm: 208.91)
## LM test for residual autocorrelation
## test value: 0.80749, p-value: 0.36886

5.1 GWR Model

First proposed by Brunsdon et al. (1996).

Their model is specified as

\(y_i = \beta_0(u_i, v_i) + \sum_k \beta_k(u_i, v_i) x_{ik} + \epsilon_i\)

where each \(\beta_k\) is now estimated at the location \((u_i, v_i)\), the geographic coordinates of observation i. \(\beta_k(u_i, v_i)\) is the local realization of \(\beta_k\) at location i. This constructs a trend surface of parameter values for each independent variable and the model intercept.
Note that the basic OLS regression model above is just a special case of the GWR model in which the coefficients are constant over space. The parameters in GWR are estimated by weighted least squares. The weighting matrix is a diagonal matrix, with each diagonal element being a function of the location of the observation. If \(W_i\) is the weighting matrix at location i, then the parameter estimate at that location is

\(\hat \beta_i = (X'W_iX)^{-1} X'W_iY\)

The role of the weight matrix is to give more weight to observations that are close to i, as it is assumed that nearby observations influence each other more than those that are far away. Note that this is NOT the same W matrix as was used in Moran’s I!
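The local weighted least squares estimate above can be sketched in a few lines of base R. This is a minimal illustration with simulated data, not the spgwr implementation; the variables x, y, and the weights w are made up for the example.

```r
# Sketch of the local WLS estimate beta_i = (X'W_iX)^{-1} X'W_iY.
# Data and weights are simulated for illustration only.
set.seed(1)
n <- 100
x <- runif(n)
y <- 1 + 2 * x + rnorm(n, sd = 0.2)
X <- cbind(1, x)                       # design matrix with intercept
w <- exp(-0.5 * (runif(n) / 0.25)^2)   # stand-in distance-based weights
Wi <- diag(w)                          # diagonal weight matrix at location i

beta_i <- solve(t(X) %*% Wi %*% X) %*% t(X) %*% Wi %*% y

# The same estimate via lm()'s weights argument:
beta_lm <- coef(lm(y ~ x, weights = w))
```

The two estimates agree to numerical precision, since lm() with case weights solves the same weighted normal equations.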

W is generally constructed based on the distances between locations: in the simplest case, \(w_{ij} = 1\) if \(d_{ij} < d\), and 0 otherwise. Brunsdon, Charlton, and Fotheringham instead used a kernel-density approach to construct W:

\(w_{ij} = exp(- \frac{1}{2}(d_{ij}/b)^2)\)

where b is the bandwidth parameter to be estimated. Under this weighting scheme, the weight for points that exactly co-occur is 1, and the weight decreases as a Gaussian curve as \(d_{ij}\) increases.
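The Gaussian kernel weight is easy to compute directly. A small sketch, using made-up distances and the fixed bandwidth found later in this section as an illustration:

```r
# Gaussian kernel weight: w_ij = exp(-0.5 * (d_ij / b)^2).
# Distances and bandwidth below are illustrative only.
gauss_wt <- function(d, b) exp(-0.5 * (d / b)^2)

d <- c(0, 50000, 100000, 213408)  # distances in meters
gauss_wt(d, b = 213408)           # weight is 1 at d = 0 and decays with distance
```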

GWR Kernels


There are several ways to choose the bandwidth, but most programs use a cross-validation method to select the optimal value. The cross-validation score takes the form: \(CV = \Sigma_i \left[y_i - \hat y_{\neq i} (\beta) \right ]^2\)

Where \(\hat y_{\neq i} (\beta)\) is the fitted value of \(y_i\) with the observations for point i omitted from the calibration process. This in effect minimizes the sum of squared errors at all locations i, and arrives at an optimal bandwidth.
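The leave-one-out CV score for a candidate bandwidth can be sketched directly from the definitions above. This is a didactic version, not the optimized routine in gwr.sel(); the data and coordinates are simulated for illustration.

```r
# Leave-one-out CV score for bandwidth b: for each i, predict y_i from a
# locally weighted regression that omits observation i, then sum squared errors.
cv_score <- function(b, y, X, coords) {
  n <- length(y)
  err <- numeric(n)
  for (i in 1:n) {
    d <- sqrt(rowSums((coords - matrix(coords[i, ], n, 2, byrow = TRUE))^2))
    w <- exp(-0.5 * (d / b)^2)  # Gaussian kernel weights
    w[i] <- 0                   # omit point i from its own calibration
    b_i <- solve(t(X) %*% (w * X), t(X) %*% (w * y))
    err[i] <- y[i] - X[i, ] %*% b_i
  }
  sum(err^2)
}

# Simulated example: compare the score at two candidate bandwidths
set.seed(2)
n <- 60
coords <- cbind(runif(n), runif(n))
X <- cbind(1, runif(n))
y <- X %*% c(1, 2) + rnorm(n, sd = 0.3)
cv_score(0.2, y, X, coords)
cv_score(0.5, y, X, coords)
```

In practice gwr.sel() performs this search automatically, as shown in the output below.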

When the model is fit, we can also calculate local z/t tests to assess whether the effect of a specific X variable is significant at a particular location. Likewise, we can calculate a local \(R^2\) measure at each location.

Leung et al. (2000) proposed a series of F tests for model improvement over OLS, and they also derive an F test for variation in the beta coefficients (the so-called F3 test).

5.1.1 GWR Modeling

The advantage of GWR is that we can visualize the differences in the covariate effects as a continuous surface. There are a few steps to fitting the GWR model. First, we need to select a bandwidth for the weighting function:

gwr.b.f<-gwr.sel(I(deaths/population)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7), spdat)
## Bandwidth: 837305.2 CV score: 0.0006802827 
## Bandwidth: 1353436 CV score: 0.0006843992 
## Bandwidth: 518319 CV score: 0.000671332 
## Bandwidth: 321174.7 CV score: 0.0006543891 
## Bandwidth: 199332.9 CV score: 0.0006472186 
## Bandwidth: 124030.4 CV score: 0.0006733644 
## Bandwidth: 245872.3 CV score: 0.0006478826 
## Bandwidth: 210553.7 CV score: 0.0006470031 
## Bandwidth: 215074.7 CV score: 0.0006469975 
## Bandwidth: 213356.6 CV score: 0.0006469947 
## Bandwidth: 213425.2 CV score: 0.0006469947 
## Bandwidth: 213408.1 CV score: 0.0006469947 
## Bandwidth: 213407.9 CV score: 0.0006469947 
## Bandwidth: 213407.9 CV score: 0.0006469947 
## Bandwidth: 213407.9 CV score: 0.0006469947 
## Bandwidth: 213408 CV score: 0.0006469947 
## Bandwidth: 213407.9 CV score: 0.0006469947 
## Bandwidth: 213407.9 CV score: 0.0006469947 
## Bandwidth: 213407.9 CV score: 0.0006469947 
## Bandwidth: 213407.9 CV score: 0.0006469947 
## Bandwidth: 213407.9 CV score: 0.0006469947
gwr.b.f
## [1] 213407.9
#This is the search radius (in meters, because our data are projected in a system measured in meters); the weighting function includes all observations within this distance.

gwr.f<-gwr(I(deaths/population)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7), spdat, bandwidth=gwr.b.f, se.fit=T, hatmatrix=T)
gwr.f
## Call:
## gwr(formula = I(deaths/population) ~ scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7), data = spdat, 
##     bandwidth = gwr.b.f, hatmatrix = T, se.fit = T)
## Kernel function: gwr.Gauss 
## Fixed bandwidth: 213407.9 
## Summary of GWR coefficient estimates at data points:
##                         Min.    1st Qu.     Median    3rd Qu.       Max.
## X.Intercept.       9.360e-03  1.027e-02  1.045e-02  1.080e-02  1.125e-02
## scale.ppersonspo. -2.531e-04  4.700e-04  6.296e-04  7.949e-04  9.485e-04
## scale.p65plus.     1.410e-03  2.060e-03  2.136e-03  2.220e-03  2.367e-03
## scale.pblack_1.   -1.134e-04  4.689e-05  1.157e-04  2.391e-04  2.934e-03
## scale.phisp.      -2.182e-03 -1.048e-03 -9.715e-04 -7.932e-04  3.420e-04
## I.RUCC....7.TRUE  -1.096e-03 -3.638e-04 -2.153e-04  5.367e-05  4.777e-04
##                    Global
## X.Intercept.       0.0105
## scale.ppersonspo.  0.0006
## scale.p65plus.     0.0021
## scale.pblack_1.    0.0001
## scale.phisp.      -0.0009
## I.RUCC....7.TRUE  -0.0002
## Number of data points: 501 
## Effective number of parameters (residual: 2traceS - traceS'S): 46.73469 
## Effective degrees of freedom (residual: 2traceS - traceS'S): 454.2653 
## Sigma (residual: 2traceS - traceS'S): 0.001059183 
## Effective number of parameters (model: traceS): 33.82928 
## Effective degrees of freedom (model: traceS): 467.1707 
## Sigma (model: traceS): 0.001044451 
## Sigma (ML): 0.001008572 
## AICc (GWR p. 61, eq 2.33; p. 96, eq. 4.21): -5416.218 
## AIC (GWR p. 96, eq. 4.22): -5457.412 
## Residual sum of squares: 0.0005096261 
## Quasi-global R2: 0.8524083
BFC02.gwr.test(gwr.f)
## 
##  Brunsdon, Fotheringham & Charlton (2002, pp. 91-2) ANOVA
## 
## data:  gwr.f
## F = 1.3055, df1 = 495.00, df2 = 454.27, p-value = 0.001934
## alternative hypothesis: greater
## sample estimates:
## SS OLS residuals SS GWR residuals 
##     0.0006653254     0.0005096261
BFC99.gwr.test(gwr.f)
## 
##  Brunsdon, Fotheringham & Charlton (1999) ANOVA
## 
## data:  gwr.f
## F = 3.4071, df1 = 256.14, df2 = 470.49, p-value < 2.2e-16
## alternative hypothesis: greater
## sample estimates:
## SS GWR improvement   SS GWR residuals 
##       0.0001556992       0.0005096261
LMZ.F1GWR.test(gwr.f)
## 
##  Leung et al. (2000) F(1) test
## 
## data:  gwr.f
## F = 0.83467, df1 = 470.49, df2 = 495.00, p-value = 0.02385
## alternative hypothesis: less
## sample estimates:
## SS OLS residuals SS GWR residuals 
##     0.0006653254     0.0005096261
LMZ.F2GWR.test(gwr.f)
## 
##  Leung et al. (2000) F(2) test
## 
## data:  gwr.f
## F = 2.8438, df1 = 66.187, df2 = 495.000, p-value = 5.23e-11
## alternative hypothesis: greater
## sample estimates:
##   SS OLS residuals SS GWR improvement 
##       0.0006653254       0.0001556992
LMZ.F3GWR.test(gwr.f)
## 
## Leung et al. (2000) F(3) test
## 
##                   F statistic Numerator d.f. Denominator d.f.     Pr(>)
## (Intercept)            2.9132        54.3122           470.49 5.168e-10
## scale(ppersonspo)      3.7056       143.4837           470.49 < 2.2e-16
## scale(p65plus)         2.1047       106.2542           470.49 5.926e-08
## scale(pblack_1)        2.8647        42.1765           470.49 3.025e-08
## scale(phisp)           2.4810        46.2870           470.49 9.023e-07
## I(RUCC >= 7)TRUE       2.3018       205.7501           470.49 8.846e-14
##                      
## (Intercept)       ***
## scale(ppersonspo) ***
## scale(p65plus)    ***
## scale(pblack_1)   ***
## scale(phisp)      ***
## I(RUCC >= 7)TRUE  ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
gwr.dat<-gwr.f$SDF
cols<-brewer.pal(n=6, name="RdBu")
names(gwr.dat)
##  [1] "sum.w"                    "X.Intercept."            
##  [3] "scale.ppersonspo."        "scale.p65plus."          
##  [5] "scale.pblack_1."          "scale.phisp."            
##  [7] "I.RUCC....7.TRUE"         "X.Intercept._se"         
##  [9] "scale.ppersonspo._se"     "scale.p65plus._se"       
## [11] "scale.pblack_1._se"       "scale.phisp._se"         
## [13] "I.RUCC....7.TRUE_se"      "gwr.e"                   
## [15] "pred"                     "pred.se"                 
## [17] "localR2"                  "X.Intercept._se_EDF"     
## [19] "scale.ppersonspo._se_EDF" "scale.p65plus._se_EDF"   
## [21] "scale.pblack_1._se_EDF"   "scale.phisp._se_EDF"     
## [23] "I.RUCC....7.TRUE_se_EDF"  "pred.se_EDF"
#pdf("gwrfixed.pdf")
spplot(gwr.dat, "scale.ppersonspo.", at=quantile(gwr.f$SDF$scale.ppersonspo.), col.regions=cols, main="% Poverty effect")

spplot(gwr.dat, "scale.pblack_1.", at=quantile(gwr.f$SDF$scale.pblack_1.), col.regions=cols, main="% Black effect")

spplot(gwr.dat, "scale.phisp.", at=quantile(gwr.f$SDF$scale.phisp.), col.regions=cols, main="% Hispanic effect")

#another "lighter" plot method
#brks<-c(-3.038e-04,  3.441e-04,  5.882e-04,  7.912e-04 , 1.355e-03); cols<-brewer.pal(n=5,"Reds")
#plot(gwr.a$SDF, col=cols[findInterval(gwr.a$SDF$scale.ppersonspo.,brks, all.inside=TRUE)])

#Local R^2
brks<-seq(0, 1, length.out = 9); cols=brewer.pal(n=9, "Reds")
spplot(gwr.f$SDF, "localR2", at=brks, col.regions=cols, main="local R^2")

#plots of the correlations of the coefficients
pairs(gwr.f$SDF[,3:7])

cor(as(gwr.f$SDF[,3:7], "data.frame"))
##                   scale.ppersonspo. scale.p65plus. scale.pblack_1.
## scale.ppersonspo.       1.000000000   -0.001474693      -0.3823173
## scale.p65plus.         -0.001474693    1.000000000      -0.6196133
## scale.pblack_1.        -0.382317262   -0.619613324       1.0000000
## scale.phisp.           -0.650938401   -0.193689629       0.6597658
## I.RUCC....7.TRUE       -0.286463809    0.134305435      -0.2545913
##                   scale.phisp. I.RUCC....7.TRUE
## scale.ppersonspo.   -0.6509384       -0.2864638
## scale.p65plus.      -0.1936896        0.1343054
## scale.pblack_1.      0.6597658       -0.2545913
## scale.phisp.         1.0000000       -0.2432299
## I.RUCC....7.TRUE    -0.2432299        1.0000000

This plot illustrates a well-known problem with GWR: the local coefficients are often highly correlated with one another. See Wheeler and Tiefelsdorf (2005) for one of the major critiques of GWR on these grounds. In general, these critiques have pushed most users to treat GWR as an "exploratory" tool only. One way to explore the model is to examine where the regression effects are "significant" versus not, by mapping the local t-statistics (or z-statistics, if you prefer).
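One quick way to quantify this is a condition-number-style diagnostic on the coefficient correlation matrix: a large ratio of the largest to the smallest eigenvalue signals strongly correlated local coefficients. A minimal sketch, with a toy 2x2 correlation matrix standing in for the full cor() matrix above:

```r
# Sketch: eigenvalue ratio of a correlation matrix as a collinearity signal.
# The -0.65 below is a toy value, close to the poverty/%Hispanic correlation.
R <- matrix(c(1, -0.65,
              -0.65, 1), nrow = 2, byrow = TRUE)
ev <- eigen(R, symmetric = TRUE)$values
ev[1] / ev[2]  # about 4.71; ratios far above 1 indicate correlated coefficients
```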

cols<-brewer.pal(8,"RdBu")
dfree<-gwr.f$results$edf

names(gwr.dat)
##  [1] "sum.w"                    "X.Intercept."            
##  [3] "scale.ppersonspo."        "scale.p65plus."          
##  [5] "scale.pblack_1."          "scale.phisp."            
##  [7] "I.RUCC....7.TRUE"         "X.Intercept._se"         
##  [9] "scale.ppersonspo._se"     "scale.p65plus._se"       
## [11] "scale.pblack_1._se"       "scale.phisp._se"         
## [13] "I.RUCC....7.TRUE_se"      "gwr.e"                   
## [15] "pred"                     "pred.se"                 
## [17] "localR2"                  "X.Intercept._se_EDF"     
## [19] "scale.ppersonspo._se_EDF" "scale.p65plus._se_EDF"   
## [21] "scale.pblack_1._se_EDF"   "scale.phisp._se_EDF"     
## [23] "I.RUCC....7.TRUE_se_EDF"  "pred.se_EDF"
gwr.dat$pov.t<-gwr.dat$scale.ppersonspo./gwr.dat$scale.ppersonspo._se
gwr.dat$pov.t.p<-2*pt(-abs(gwr.dat$pov.t)  , dfree)
tbrks<- c(min(gwr.dat$pov.t),qt(c( .975, .9, .1, .025),df=dfree, lower.tail = F),max(gwr.dat$pov.t))
spplot(gwr.dat, "pov.t", col.regions=cols, at=tbrks , main="t-stat %Poverty")

gwr.dat$black.t<-gwr.dat$scale.pblack_1./gwr.dat$scale.pblack_1._se
gwr.dat$black.t.p<-2*pt(-abs(gwr.dat$black.t)  , dfree)
tbrks<- c(min(gwr.dat$black.t),qt(c( .975, .9, .1, .025),df=dfree, lower.tail = F),max(gwr.dat$black.t))
spplot(gwr.dat, "black.t", col.regions=cols,at=tbrks , main="t-stat %Black")

gwr.dat$hisp.t<-gwr.dat$scale.phisp./gwr.dat$scale.phisp._se
gwr.dat$hisp.t.p<-2*pt(-abs(gwr.dat$hisp.t)  , dfree)
tbrks<- c(min(gwr.dat$hisp.t),qt(c( .975, .9, .1, .025),df=dfree, lower.tail = F),max(gwr.dat$hisp.t))
spplot(gwr.dat, "hisp.t", col.regions=cols, at=tbrks, main="t-stat %Hispanic")

So, those are the results from the fixed-bandwidth analysis. But there are clearly parts of our data where counties occur more densely, so an adaptive-kernel approach is also warranted. If all our observations were equally spaced, this would not be necessary.
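The Gaussian kernel used by gwr() (gwr.Gauss) down-weights observations by their distance to the regression point relative to the bandwidth h; with a fixed bandwidth h is constant everywhere, while the adaptive scheme effectively varies h so that each local fit covers a similar number of points. A self-contained sketch with hypothetical distances:

```r
# Gaussian kernel weight: w_ij = exp(-0.5 * (d_ij / h)^2)
gauss.wt <- function(d, h) exp(-0.5 * (d / h)^2)

d <- c(0, 50, 100, 200, 400)    # hypothetical distances (e.g., km)
round(gauss.wt(d, h = 100), 3)  # 1.000 0.882 0.607 0.135 0.000
round(gauss.wt(d, h = 300), 3)  # a wider bandwidth decays more slowly
```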

Now we try everything with the adaptive bandwidth:

gwr.b.a<-gwr.sel(I(deaths/population)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7), spdat, adapt=T)
## Adaptive q: 0.381966 CV score: 0.0006708917 
## Adaptive q: 0.618034 CV score: 0.0006759116 
## Adaptive q: 0.236068 CV score: 0.0006662507 
## Adaptive q: 0.145898 CV score: 0.0006610212 
## Adaptive q: 0.09016994 CV score: 0.0006542459 
## Adaptive q: 0.05572809 CV score: 0.0006442366 
## Adaptive q: 0.03444185 CV score: 0.0006379112 
## Adaptive q: 0.02128624 CV score: 0.0006543154 
## Adaptive q: 0.04257247 CV score: 0.000642091 
## Adaptive q: 0.02941686 CV score: 0.0006457471 
## Adaptive q: 0.03754747 CV score: 0.0006396459 
## Adaptive q: 0.03252248 CV score: 0.0006416178 
## Adaptive q: 0.03543098 CV score: 0.0006381852 
## Adaptive q: 0.03389516 CV score: 0.0006379173 
## Adaptive q: 0.03419852 CV score: 0.0006378657 
## Adaptive q: 0.03415783 CV score: 0.0006378591 
## Adaptive q: 0.0340575 CV score: 0.0006378438 
## Adaptive q: 0.03399549 CV score: 0.0006378352 
## Adaptive q: 0.0339548 CV score: 0.00063783 
## Adaptive q: 0.0339548 CV score: 0.00063783
gwr.b.a
## [1] 0.0339548

This is the proportion of all observations that the adaptive weighting function will include in the local model for each county. In this case, 0.0339548 &times; 501 &asymp; 17.01, so about 17 counties enter the model for any particular county.
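The neighbor count follows directly from the adaptive q; a quick check of the arithmetic, using the q and n from the run above:

```r
# q is the adaptive fraction returned by gwr.sel(); n is the number of counties
q <- 0.0339548
n <- 501
q * n          # about 17.01 observations per local model
round(q * n)   # effectively 17 counties, matching gwr()'s report
```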

gwr.a<-gwr(I(deaths/population)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7), spdat, adapt=gwr.b.a, se.fit=T, hatmatrix=T )
gwr.a
## Call:
## gwr(formula = I(deaths/population) ~ scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7), data = spdat, 
##     adapt = gwr.b.a, hatmatrix = T, se.fit = T)
## Kernel function: gwr.Gauss 
## Adaptive quantile: 0.0339548 (about 17 of 501 data points)
## Summary of GWR coefficient estimates at data points:
##                         Min.    1st Qu.     Median    3rd Qu.       Max.
## X.Intercept.       7.577e-03  9.934e-03  1.049e-02  1.089e-02  1.191e-02
## scale.ppersonspo. -3.038e-04  3.441e-04  5.882e-04  7.912e-04  1.355e-03
## scale.p65plus.     1.403e-03  1.998e-03  2.190e-03  2.333e-03  2.790e-03
## scale.pblack_1.   -3.075e-04  6.612e-05  2.370e-04  4.524e-04  2.271e-03
## scale.phisp.      -4.785e-03 -1.187e-03 -9.467e-04 -5.321e-04  7.882e-04
## I.RUCC....7.TRUE  -1.135e-03 -4.824e-04 -2.081e-04  1.074e-04  5.971e-04
##                    Global
## X.Intercept.       0.0105
## scale.ppersonspo.  0.0006
## scale.p65plus.     0.0021
## scale.pblack_1.    0.0001
## scale.phisp.      -0.0009
## I.RUCC....7.TRUE  -0.0002
## Number of data points: 501 
## Effective number of parameters (residual: 2traceS - traceS'S): 101.2227 
## Effective degrees of freedom (residual: 2traceS - traceS'S): 399.7773 
## Sigma (residual: 2traceS - traceS'S): 0.00104379 
## Effective number of parameters (model: traceS): 72.72109 
## Effective degrees of freedom (model: traceS): 428.2789 
## Sigma (model: traceS): 0.001008461 
## Sigma (ML): 0.0009324024 
## AICc (GWR p. 61, eq 2.33; p. 96, eq. 4.21): -5396.638 
## AIC (GWR p. 96, eq. 4.22): -5497.204 
## Residual sum of squares: 0.0004355565 
## Quasi-global R2: 0.8738594
BFC02.gwr.test(gwr.a)
## 
##  Brunsdon, Fotheringham & Charlton (2002, pp. 91-2) ANOVA
## 
## data:  gwr.a
## F = 1.5275, df1 = 495.00, df2 = 399.78, p-value = 5.41e-06
## alternative hypothesis: greater
## sample estimates:
## SS OLS residuals SS GWR residuals 
##     0.0006653254     0.0004355565
BFC99.gwr.test(gwr.a)
## 
##  Brunsdon, Fotheringham & Charlton (1999) ANOVA
## 
## data:  gwr.a
## F = 2.2147, df1 = 394.70, df2 = 437.65, p-value = 4.244e-16
## alternative hypothesis: greater
## sample estimates:
## SS GWR improvement   SS GWR residuals 
##       0.0002297688       0.0004355565
LMZ.F1GWR.test(gwr.a)
## 
##  Leung et al. (2000) F(1) test
## 
## data:  gwr.a
## F = 0.81058, df1 = 437.65, df2 = 495.00, p-value = 0.01213
## alternative hypothesis: less
## sample estimates:
## SS OLS residuals SS GWR residuals 
##     0.0006653254     0.0004355565
LMZ.F2GWR.test(gwr.a)
## 
##  Leung et al. (2000) F(2) test
## 
## data:  gwr.a
## F = 1.7952, df1 = 149.55, df2 = 495.00, p-value = 1.501e-06
## alternative hypothesis: greater
## sample estimates:
##   SS OLS residuals SS GWR improvement 
##       0.0006653254       0.0002297688
LMZ.F3GWR.test(gwr.a)
## 
## Leung et al. (2000) F(3) test
## 
##                   F statistic Numerator d.f. Denominator d.f.     Pr(>)
## (Intercept)            1.4040        16.7158           437.65  0.131199
## scale(ppersonspo)      1.9580       141.0856           437.65 1.094e-07
## scale(p65plus)         1.7019       124.0449           437.65 4.939e-05
## scale(pblack_1)        1.5254        60.3445           437.65  0.009694
## scale(phisp)           1.0542        18.9957           437.65  0.396847
## I(RUCC >= 7)TRUE       1.4327       177.0345           437.65  0.001656
##                      
## (Intercept)          
## scale(ppersonspo) ***
## scale(p65plus)    ***
## scale(pblack_1)   ** 
## scale(phisp)         
## I(RUCC >= 7)TRUE  ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
gwr.data<-gwr.a$SDF
spplot(gwr.data, "scale.ppersonspo.", at=quantile(gwr.a$SDF$scale.ppersonspo.), col.regions=cols, main="% Poverty effect")

spplot(gwr.data, "scale.pblack_1.", at=quantile(gwr.a$SDF$scale.pblack_1.), col.regions=cols, main="% Black effect")

spplot(gwr.data, "scale.phisp.", at=quantile(gwr.a$SDF$scale.phisp.), col.regions=cols, main="% Hispanic effect")

cols<-brewer.pal(8,"RdBu")
dfree<-gwr.a$results$edf

names(gwr.data)
##  [1] "sum.w"                    "X.Intercept."            
##  [3] "scale.ppersonspo."        "scale.p65plus."          
##  [5] "scale.pblack_1."          "scale.phisp."            
##  [7] "I.RUCC....7.TRUE"         "X.Intercept._se"         
##  [9] "scale.ppersonspo._se"     "scale.p65plus._se"       
## [11] "scale.pblack_1._se"       "scale.phisp._se"         
## [13] "I.RUCC....7.TRUE_se"      "gwr.e"                   
## [15] "pred"                     "pred.se"                 
## [17] "localR2"                  "X.Intercept._se_EDF"     
## [19] "scale.ppersonspo._se_EDF" "scale.p65plus._se_EDF"   
## [21] "scale.pblack_1._se_EDF"   "scale.phisp._se_EDF"     
## [23] "I.RUCC....7.TRUE_se_EDF"  "pred.se_EDF"
gwr.data$pov.t<-gwr.data$scale.ppersonspo./gwr.data$scale.ppersonspo._se
gwr.data$pov.t.p<-2*pt(-abs(gwr.data$pov.t)  , dfree)
tbrks<- c(min(gwr.data$pov.t),qt(c( .975, .9, .1, .025),df=dfree, lower.tail = F),max(gwr.data$pov.t))
spplot(gwr.data, "pov.t", col.regions=cols, at=tbrks , main="t-stat %Poverty")

gwr.data$black.t<-gwr.data$scale.pblack_1./gwr.data$scale.pblack_1._se
gwr.data$black.t.p<-2*pt(-abs(gwr.data$black.t)  , dfree)
tbrks<- c(min(gwr.data$black.t),qt(c( .975, .9, .1, .025),df=dfree, lower.tail = F),max(gwr.data$black.t))
spplot(gwr.data, "black.t", col.regions=cols,at=tbrks , main="t-stat %Black")

gwr.data$hisp.t<-gwr.data$scale.phisp./gwr.data$scale.phisp._se
gwr.data$hisp.t.p<-2*pt(-abs(gwr.data$hisp.t)  , dfree)
tbrks<- c(min(gwr.data$hisp.t),qt(c( .975, .9, .1, .025),df=dfree, lower.tail = F),max(gwr.data$hisp.t))
spplot(gwr.data, "hisp.t", col.regions=cols, at=tbrks, main="t-stat %Hispanic")

#Local R^2
brks<-seq(0, 1, length.out = 9); cols=brewer.pal(n=9, "Reds")
spplot(gwr.a$SDF, "localR2", at=brks, col.regions=cols, main="local R^2")

6 Alternative Construction of spatial regimes - Clustering on GWR coefficients

The first step in constructing spatial regimes is a hierarchical clustering of the observations, based on a distance matrix between them. In this case, we use the GWR coefficients as the clustering variables.

dat.dist<-dist(gwr.data@data[, 3:7])

#The hclust() function does basic hierarchical clustering
#according to several different methods, we'll use Ward's method
clust.dat<-hclust(dat.dist, method="ward.D")

#And we'll plot the dendrogram, or tree plot
plot(clust.dat)

#I only want a few groups, so I'll cut the tree so I get 5 clusters
gwr.data$clus<-cutree(clust.dat, k=5)


#I'll use table() to get the frequencies of each cluster
table(gwr.data$clus)
## 
##   1   2   3   4   5 
##  60 269  25  90  57
#to get it to plot right, we have to convert the cluster number
#to a factor variable
gwr.data$b.cf<-as.factor(gwr.data$clus)
spplot(gwr.data,"b.cf", col.regions=brewer.pal(4, "Accent"), main="GWR Spatial Regimes")

#We can get the means of each coefficient for each of the regime areas
b.means<-aggregate(gwr.data@data[, 3:5], by=list(gwr.data$clus), mean)
b.means
##   Group.1 scale.ppersonspo. scale.p65plus. scale.pblack_1.
## 1       1      0.0005078355    0.002188644    0.0002581329
## 2       2      0.0006852921    0.002200900    0.0001550988
## 3       3      0.0007126707    0.001891788    0.0001570193
## 4       4      0.0002219154    0.002300691    0.0004603948
## 5       5      0.0005660539    0.001853176    0.0013099481
#The main idea behind examining spatial regimes is to look for areas where
#the model may work differently. Having identified the regimes, we now fit
#a separate OLS model to each one.
spdat$cluster<-gwr.data$clus
lm1<-lm(I(deaths/population)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7),spdat, subset=spdat$cluster==1)
lm2<-lm(I(deaths/population)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7),spdat, subset=spdat$cluster==2)
lm3<-lm(I(deaths/population)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7),spdat, subset=spdat$cluster==3)
lm4<-lm(I(deaths/population)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7),spdat, subset=spdat$cluster==4)
#(a fifth model, for cluster 5, could be fit the same way)

#examine each of the spatial regimes
summary(lm1)
## 
## Call:
## lm(formula = I(deaths/population) ~ scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7), data = spdat, 
##     subset = spdat$cluster == 1)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -1.968e-03 -5.651e-04 -6.159e-05  4.298e-04  2.714e-03 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        8.594e-03  1.050e-03   8.188 4.92e-11 ***
## scale(ppersonspo)  5.464e-04  1.713e-04   3.190  0.00237 ** 
## scale(p65plus)     2.076e-03  1.752e-04  11.852  < 2e-16 ***
## scale(pblack_1)    2.074e-04  1.319e-04   1.573  0.12167    
## scale(phisp)      -3.071e-03  1.409e-03  -2.180  0.03363 *  
## I(RUCC >= 7)TRUE  -5.581e-05  2.872e-04  -0.194  0.84666    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.000884 on 54 degrees of freedom
## Multiple R-squared:  0.8255, Adjusted R-squared:  0.8094 
## F-statistic:  51.1 on 5 and 54 DF,  p-value: < 2.2e-16
summary(lm2)
## 
## Call:
## lm(formula = I(deaths/population) ~ scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7), data = spdat, 
##     subset = spdat$cluster == 2)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0043029 -0.0005601  0.0000598  0.0006509  0.0029322 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        1.041e-02  1.040e-04 100.131  < 2e-16 ***
## scale(ppersonspo)  7.790e-04  1.087e-04   7.166 7.79e-12 ***
## scale(p65plus)     2.075e-03  8.251e-05  25.144  < 2e-16 ***
## scale(pblack_1)   -1.279e-04  1.052e-04  -1.216    0.225    
## scale(phisp)      -1.287e-03  1.277e-04 -10.082  < 2e-16 ***
## I(RUCC >= 7)TRUE  -2.334e-04  1.620e-04  -1.441    0.151    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.001093 on 263 degrees of freedom
## Multiple R-squared:  0.8347, Adjusted R-squared:  0.8315 
## F-statistic: 265.5 on 5 and 263 DF,  p-value: < 2.2e-16
summary(lm3)
## 
## Call:
## lm(formula = I(deaths/population) ~ scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7), data = spdat, 
##     subset = spdat$cluster == 3)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0013514 -0.0005655 -0.0001463  0.0007187  0.0014008 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)   
## (Intercept)        0.0105650  0.0035702   2.959  0.00806 **
## scale(ppersonspo)  0.0018243  0.0006118   2.982  0.00766 **
## scale(p65plus)     0.0013615  0.0004114   3.309  0.00369 **
## scale(pblack_1)   -0.0006823  0.0004084  -1.671  0.11121   
## scale(phisp)      -0.0004845  0.0047703  -0.102  0.92017   
## I(RUCC >= 7)TRUE   0.0010489  0.0005555   1.888  0.07439 . 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.0008556 on 19 degrees of freedom
## Multiple R-squared:  0.861,  Adjusted R-squared:  0.8244 
## F-statistic: 23.54 on 5 and 19 DF,  p-value: 1.563e-07
summary(lm4)
## 
## Call:
## lm(formula = I(deaths/population) ~ scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7), data = spdat, 
##     subset = spdat$cluster == 4)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0041216 -0.0005345  0.0000171  0.0006476  0.0022513 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        1.025e-02  1.868e-04  54.847   <2e-16 ***
## scale(ppersonspo)  2.915e-06  1.564e-04   0.019    0.985    
## scale(p65plus)     2.285e-03  1.562e-04  14.629   <2e-16 ***
## scale(pblack_1)    2.999e-04  2.097e-04   1.430    0.156    
## scale(phisp)      -2.728e-04  1.970e-04  -1.385    0.170    
## I(RUCC >= 7)TRUE  -3.323e-05  2.976e-04  -0.112    0.911    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.001087 on 84 degrees of freedom
## Multiple R-squared:  0.7977, Adjusted R-squared:  0.7857 
## F-statistic: 66.24 on 5 and 84 DF,  p-value: < 2.2e-16
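As a side note on the mechanics: instead of writing one lm() call per regime, the same per-cluster fits can be generated in a loop. A self-contained sketch with a toy data frame standing in for spdat@data (the formula and cluster variable here are placeholders):

```r
set.seed(2)
toy <- data.frame(y = rnorm(100), x = rnorm(100),
                  cluster = sample(1:4, 100, replace = TRUE))
# One lm() per cluster via split() + lapply()
fits <- lapply(split(toy, toy$cluster), function(d) lm(y ~ x, data = d))
sapply(fits, function(m) coef(m)["x"])  # compare slopes across regimes
```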

Or, we could cluster on the original attributes in the data to find similar places based on their demographic variables:

dem.dist<-dist(spdat@data[, c("ppersonspo", "p65plus", "pblack_1", "phisp")])

#The hclust() function does basic hierarchical clustering
#according to several different methods, we'll use Ward's method
clust.dem<-hclust(dem.dist, method="ward.D")

#And we'll plot the dendrogram, or tree plot
plot(clust.dem)

#I only want a few groups, so I'll cut the tree so I get 4 clusters
spdat$clus<-cutree(clust.dem, k=4)


#I'll use table() to get the frequencies of each cluster
table(spdat$clus)
## 
##   1   2   3   4 
## 127 244 104  26
#to get it to plot right, we have to convert the cluster number
#to a factor variable
spdat$d.cf<-as.factor(spdat$clus)
spplot(spdat,"d.cf", col.regions=brewer.pal(4, "Accent"), main="Demographic Spatial Regimes")

#We can get the means of each demographic variable for each of the regime areas
b.means<-aggregate(spdat@data[, c("ppersonspo", "p65plus", "pblack_1", "phisp")], by=list(spdat$d.cf), mean)
b.means
##   Group.1 ppersonspo   p65plus   pblack_1      phisp
## 1       1  0.2065583 0.1360936 0.29850591 0.04920953
## 2       2  0.1531682 0.1637101 0.03967098 0.08239738
## 3       3  0.1849537 0.1378306 0.03679615 0.41140663
## 4       4  0.3034971 0.1186762 0.01008154 0.83316154
#fit separate ols models to each regime

lm1<-lm(I(deaths/population)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7),spdat, subset=spdat$d.cf==1)
lm2<-lm(I(deaths/population)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7),spdat, subset=spdat$d.cf==2)
lm3<-lm(I(deaths/population)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7),spdat, subset=spdat$d.cf==3)
lm4<-lm(I(deaths/population)~scale(ppersonspo)+scale(p65plus)+scale(pblack_1)+scale(phisp)+I(RUCC>=7),spdat, subset=spdat$d.cf==4)

#examine each of the spatial regimes
summary(lm1)
## 
## Call:
## lm(formula = I(deaths/population) ~ scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7), data = spdat, 
##     subset = spdat$d.cf == 1)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0039289 -0.0004871 -0.0000248  0.0005462  0.0017959 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        1.024e-02  2.337e-04  43.807  < 2e-16 ***
## scale(ppersonspo)  4.255e-04  1.365e-04   3.117 0.002280 ** 
## scale(p65plus)     2.424e-03  1.239e-04  19.567  < 2e-16 ***
## scale(pblack_1)    7.774e-05  1.303e-04   0.597 0.551800    
## scale(phisp)      -1.226e-03  3.557e-04  -3.448 0.000778 ***
## I(RUCC >= 7)TRUE   3.041e-04  1.873e-04   1.624 0.107040    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.0008633 on 121 degrees of freedom
## Multiple R-squared:  0.8521, Adjusted R-squared:  0.8459 
## F-statistic: 139.4 on 5 and 121 DF,  p-value: < 2.2e-16
summary(lm2)
## 
## Call:
## lm(formula = I(deaths/population) ~ scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7), data = spdat, 
##     subset = spdat$d.cf == 2)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0040465 -0.0005578  0.0000415  0.0006278  0.0030116 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        1.059e-02  1.782e-04  59.414  < 2e-16 ***
## scale(ppersonspo)  7.103e-04  1.047e-04   6.785 9.12e-11 ***
## scale(p65plus)     2.244e-03  8.082e-05  27.762  < 2e-16 ***
## scale(pblack_1)    4.529e-04  2.613e-04   1.733   0.0843 .  
## scale(phisp)      -1.304e-03  2.199e-04  -5.929 1.06e-08 ***
## I(RUCC >= 7)TRUE  -4.113e-04  1.725e-04  -2.384   0.0179 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.001128 on 238 degrees of freedom
## Multiple R-squared:  0.8312, Adjusted R-squared:  0.8277 
## F-statistic: 234.4 on 5 and 238 DF,  p-value: < 2.2e-16
summary(lm3)
## 
## Call:
## lm(formula = I(deaths/population) ~ scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7), data = spdat, 
##     subset = spdat$d.cf == 3)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -0.0039624 -0.0005800 -0.0000028  0.0007099  0.0027897 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        0.0100181  0.0003369  29.739  < 2e-16 ***
## scale(ppersonspo)  0.0006943  0.0001774   3.914 0.000168 ***
## scale(p65plus)     0.0014065  0.0001416   9.935  < 2e-16 ***
## scale(pblack_1)    0.0001337  0.0004301   0.311 0.756568    
## scale(phisp)      -0.0005599  0.0002939  -1.905 0.059702 .  
## I(RUCC >= 7)TRUE  -0.0003410  0.0002562  -1.331 0.186315    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.001126 on 98 degrees of freedom
## Multiple R-squared:  0.6031, Adjusted R-squared:  0.5828 
## F-statistic: 29.78 on 5 and 98 DF,  p-value: < 2.2e-16
summary(lm4)
## 
## Call:
## lm(formula = I(deaths/population) ~ scale(ppersonspo) + scale(p65plus) + 
##     scale(pblack_1) + scale(phisp) + I(RUCC >= 7), data = spdat, 
##     subset = spdat$d.cf == 4)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -2.272e-03 -8.606e-04 -5.617e-05  9.027e-04  2.041e-03 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)   
## (Intercept)        0.0067730  0.0028478   2.378  0.02747 * 
## scale(ppersonspo) -0.0006504  0.0002674  -2.432  0.02452 * 
## scale(p65plus)     0.0018583  0.0005558   3.343  0.00324 **
## scale(pblack_1)    0.0002712  0.0033340   0.081  0.93598   
## scale(phisp)       0.0010968  0.0010453   1.049  0.30658   
## I(RUCC >= 7)TRUE   0.0004633  0.0005691   0.814  0.42514   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.001295 on 20 degrees of freedom
## Multiple R-squared:  0.5752, Adjusted R-squared:  0.469 
## F-statistic: 5.417 on 5 and 20 DF,  p-value: 0.002613